ABBRATE: AUTOMATING BBR CONGESTION CONTROL ATTACK EXPLORATION USING A
MODEL-BASED APPROACH
by
Anthony Peterson
A Thesis Submitted to the Faculty of the
Khoury College of Computer Sciences of Northeastern University
In Partial Fulfillment of the Requirements for the Degree of
Master of Science
Boston, Massachusetts
May, 2019
© Anthony Peterson 2019
All rights reserved
Acknowledgments
I would like to thank:
Dr. Cristina Nita-Rotaru, my advisor, for her guidance and support. Not only did
Cristina introduce me to interesting topics in computer science, but also taught me
how to be a better researcher. When I first decided to pursue graduate school, it
was because I wanted to learn how to do research. Cristina taught me how to find
relevant research ideas and how to become a better independent researcher.
Dr. David Choffnes and Dr. Endadul Hoque, my readers and committee members,
for their support throughout this work. Endadul's insight in teaching me how to
run experiments and how to obtain and analyze results was incredibly helpful
throughout the entirety of this work. Dave’s insight into writing and discussing data
presented in figures was incredibly helpful. His attention to detail and determination
for understanding data was an important lesson and one that will remain with me
throughout my career.
Dr. Samuel Jero for ideas, encouragement, tutelage, answering my seemingly endless
emails, teaching me valuable low-level concepts about computer networks and for the
BBR finite-state machine.
Nicole B. Keefner for patiently listening to me talk about my research and helping
me prepare for my thesis defense.
My friends and family for their support during my graduate education.
The citizens of these United States, for funding my graduate education through the
National Science Foundation Scholarship for Service Fellowship.
Table of Contents
Page
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Thesis Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2 Congestion Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1 Congestion Control Overview . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.1 Inferring Congestion . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.2 Acknowledgment Packets . . . . . . . . . . . . . . . . . . . . . . 7
2.1.3 Discovering the Target Sending Rate . . . . . . . . . . . . . . . 7
2.1.4 Achieving Fairness with Competing Flows . . . . . . . . . . . . 8
2.2 Congestion Signal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.1 Re-transmission Timeout . . . . . . . . . . . . . . . . . . . . . . 8
2.2.2 Duplicate Acknowledgments . . . . . . . . . . . . . . . . . . . . 9
2.2.3 Delayed Acknowledgments . . . . . . . . . . . . . . . . . . . . . 10
2.2.4 Adapting to Congestion . . . . . . . . . . . . . . . . . . . . . . 10
2.3 A State Machine for Congestion Control . . . . . . . . . . . . . . . . . 11
2.3.1 Slow Start . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3.2 Congestion Avoidance . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.3 Fast Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.4 Exponential Backoff . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.4 Congestion Control and Avoidance Schemes . . . . . . . . . . . . . . . 13
2.4.1 TCP Tahoe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4.2 TCP New Reno . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4.3 TCP Vegas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4.4 TCP Compound . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.4.5 TCP CUBIC . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.5 Congestion Control Attacks . . . . . . . . . . . . . . . . . . . . . . . . 16
3 BBR Overview & Vulnerabilities . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.1 BBR Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.2 Estimating the Network Path Model . . . . . . . . . . . . . . . . . . . 19
3.2.1 Bottleneck Bandwidth . . . . . . . . . . . . . . . . . . . . . . . 19
3.2.2 Round-Trip Propagation Delay . . . . . . . . . . . . . . . . . . 20
3.3 Rate Limiting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.4 A State Machine for BBR . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.4.1 Startup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.4.2 Drain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.4.3 ProbeBW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.4.4 ProbeRTT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.4.5 Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.4.6 Exponential Backoff . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.4.7 Rate Limited . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.5 Example Attack Against BBR . . . . . . . . . . . . . . . . . . . . . . . 27
4 Automated Attack Exploration in BBR . . . . . . . . . . . . . . . . . . . . . 28
4.1 Attacker Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.2 Modifying TCPwn for BBR . . . . . . . . . . . . . . . . . . . . . . . . 29
4.3 State Inference for BBR . . . . . . . . . . . . . . . . . . . . . . . . . . 30
5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.1 Experiment Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.1.1 Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.1.2 Attack Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.1.3 Attack Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.2 Discovered Attacks on BBR . . . . . . . . . . . . . . . . . . . . . . . . 37
5.2.1 Attack 1 – Optimistic Acknowledgments . . . . . . . . . . . . . 38
5.2.2 Attack 2 – Delayed Acknowledgments . . . . . . . . . . . . . . . 42
5.2.3 Attack 3 – Repeated RTO . . . . . . . . . . . . . . . . . . . . . 45
5.2.4 Attack 4 – Re-transmission Timeout Stall . . . . . . . . . . . . 50
5.2.5 Attack 5 – Sequence Number Desync . . . . . . . . . . . . . . . 53
5.3 Ineffective Congestion Control Attacks Against BBR . . . . . . . . . . 56
5.3.1 Acknowledgment Bursts . . . . . . . . . . . . . . . . . . . . . . 56
5.3.2 Acknowledgment Division . . . . . . . . . . . . . . . . . . . . . 57
5.3.3 Duplicate Acknowledgments . . . . . . . . . . . . . . . . . . . . 58
6 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
6.1 BBR Congestion Control Analysis . . . . . . . . . . . . . . . . . . . . . 60
6.2 Congestion Control Attacks . . . . . . . . . . . . . . . . . . . . . . . . 61
6.3 Protocol Fuzzing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
List of Tables
Table Page
3.1 BBR finite-state machine variable and event descriptions . . . . . . . . . . 22
5.1 Discovered attack classes targeting TCP BBR congestion control . . . . . . 34
5.2 Optimistic ACK attack sending rate over time . . . . . . . . . . . . . . . . 37
List of Figures
Figure Page
1.1 Loss-based congestion control in today’s networks . . . . . . . . . . . . . . 2
2.1 Re-transmission timeout and duplicate acknowledgment time lines . . . . . 9
3.1 Optimal vs. loss-based operating points . . . . . . . . . . . . . . . . . . . 18
3.2 Linux TCP BBR v1.0 finite-state machine . . . . . . . . . . . . . . . . . . 23
5.1 TCPwn testing environment . . . . . . . . . . . . . . . . . . . . . . . . . 34
5.2 CDF of discovered attack classes executed 100 times . . . . . . . . . . . . 37
5.3 Optimistic acknowledgments attack timeline . . . . . . . . . . . . . . . . . 38
5.4 Attack 1 – Optimistic Acknowledgments Attack Time-Sequence Graph Detailed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
5.5 Attack 1 – Optimistic Acknowledgments Attack Time-Sequence Fairness . 41
5.6 Delayed acknowledgments attack timeline . . . . . . . . . . . . . . . . . . 42
5.7 Attack 2 – Delayed Acknowledgments Attack Time-Sequence Graph Detailed . . 44
5.8 Attack 2 – Delayed Acknowledgments Attack Time-Sequence Fairness . . . 44
5.9 Delayed ACK attack BBR vs. New Reno comparison . . . . . . . . . . . . 45
5.10 Acknowledgment-based manipulation actions that caused re-transmission timeouts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
5.11 Repeated Re-transmission timeout attack timeline . . . . . . . . . . . . . . 47
5.12 Attack 3 – Repeated Re-transmission Timeout Attack Time-Sequence Graph Detailed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5.13 Attack 3 – Repeated Re-transmission Timeout Attack Time-Sequence Fairness . . 49
5.14 Re-transmission timeout stall attack timeline . . . . . . . . . . . . . . . . . 50
5.15 Attack 4 – Re-transmission Stall Attack Time-Sequence Graph Detailed . . 52
5.16 Attack 4 – Re-transmission Stall Attack Time-Sequence Fairness . . . . . . 52
5.17 Sequence number desync stall attack timeline . . . . . . . . . . . . . . . . 53
5.18 Attack 5 – Sequence Number Desync Attack Time-Sequence Graph Detailed . . 55
5.19 Attack 5 – Sequence Number Desync Attack Time-Sequence Fairness . . . 55
5.20 Acknowledgment burst attack timeline . . . . . . . . . . . . . . . . . . . . 56
5.21 Acknowledgment division attack timeline . . . . . . . . . . . . . . . . . . . 57
5.22 Duplicate acknowledgment attack timeline . . . . . . . . . . . . . . . . . . 58
Abstract
BBR is a new congestion control algorithm proposed by Google that builds a
model of the network path, consisting of its bottleneck bandwidth and RTT, to govern
its sending rate, rather than relying on packet loss (like CUBIC and many others). While
acknowledgment manipulation attacks are well known to be effective against loss-based
congestion control, no prior work has investigated how to implement such attacks for
BBR, nor how e↵ective they are in practice. In this thesis we systematically analyze
the vulnerability of BBR to acknowledgement manipulation attacks. We create the
first detailed BBR finite state machine and a novel algorithm for inferring its current
BBR state at runtime by passively observing network traffic. We then adapt and
apply a TCP fuzzer to the Linux TCP BBR v1.0 implementation. Our approach
generated 30,297 attack strategies, of which 8,859 misled BBR about actual network
conditions. From these, we identify 5 classes of attacks causing BBR to send faster,
slower or stall.
Advisor:
Dr. Cristina Nita-Rotaru
Readers:
Dr. David Choffnes
Dr. Endadul Hoque
1 Introduction
BBR (Bottleneck Bandwidth and Round-trip propagation time) is a new conges-
tion control algorithm for TCP [1] and QUIC [2] proposed by Google in 2016. BBR is
motivated by how commonly deployed loss-based congestion control algorithms inac-
curately rely on packet loss as the primary signal for network congestion, often leaving
networks underutilized or highly congested. In today’s networks, the relationship be-
tween packet loss and network congestion has become disjoint due to varying switch
buffer sizes. Figure 1.1 illustrates the limitation of loss-based congestion control in
today’s networks.
Instead, BBR is model-based, as it creates a model of the network by periodically
estimating the available bottleneck bandwidth BtlBw and round-trip propagation de-
lay RTprop, which are used to govern the rate packets are sent into the network and
the maximum amount of data allowed in-transit.
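The arithmetic behind this model can be sketched as follows. This is an illustrative sketch, not BBR's actual implementation: the gain values and the example link parameters are assumptions chosen for the example, though the two-BDP in-flight cap mirrors BBR's cwnd bound.

```python
# Illustrative sketch of a model-based sender deriving its pacing rate and
# in-flight cap from estimates of BtlBw and RTprop. Gains and link values
# are assumptions, not BBR's exact constants.

def bdp(btlbw_bps, rtprop_s):
    """Bandwidth-delay product: bytes that fit 'in the pipe' without queuing."""
    return btlbw_bps / 8 * rtprop_s  # bits/s -> bytes in flight

def pacing_rate(btlbw_bps, gain=1.0):
    """Sending rate as a gain-scaled multiple of the estimated bottleneck bandwidth."""
    return btlbw_bps * gain

def inflight_cap(btlbw_bps, rtprop_s, cwnd_gain=2.0):
    """Cap on unacknowledged data: here two BDPs, as in BBR's cwnd bound."""
    return cwnd_gain * bdp(btlbw_bps, rtprop_s)

# Example: a 100 Mbps bottleneck with 40 ms round-trip propagation delay.
btlbw = 100e6   # bits per second
rtprop = 0.040  # seconds
print(bdp(btlbw, rtprop))           # 500000.0 bytes (~500 KB) fill the pipe
print(inflight_cap(btlbw, rtprop))  # 1000000.0 bytes allowed in transit
```

The key point of the sketch is that the sending rate and the in-flight cap are both functions of the two estimated path parameters, not of packet loss.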
Prior work [3–6] showed how loss-based congestion control schemes (e.g., New
Reno, CUBIC) designed for TCP are prone to acknowledgment manipulation attacks,
where an adversary exploits the semantics of acknowledgments to mislead the sender
(i.e., the victim) of a target flow about network congestion. These attacks are possible
because TCP headers are unencrypted and include no authentication mechanism
other than a random initial sequence number which may be observed or predicted by
on-path [3] or off-path [7–9] attackers, respectively. While at first it may appear BBR
is less prone to such attacks, as it relies on a different congestion control approach,
its estimation of BtlBw and RTprop is based on received acknowledgments. Their
impact, however, cannot be easily assessed from existing attacks against loss-based
Figure 1.1.: This figure illustrates how packet loss and actual congestion can occur at different times in today's networks, contrary to the assumption of loss-based congestion control. In deep buffers (left), packet loss occurs only after the buffer fills, causing excessive queue delay before backing off. In shallow buffers (right), packet loss occurs early, causing senders to overly decrease their sending rate and leave the link underutilized.
congestion control, because BBR follows a di↵erent algorithm for adjusting its sending
rate. Given that BBR is implemented for TCP [10], the underlying protocol for much
of the Internet's traffic, and is deployed on YouTube and Google.com [11], studying
BBR security is essential.
In this work, we discover and analyze acknowledgment manipulation attacks tar-
geting the Linux TCP BBR congestion control implementation. We systematically
found attacks that target the core mechanism of BBR: the estimation of BtlBw and
RTprop. We use a protocol-fuzzing approach to inject counterfeit packets at runtime,
following a systematic approach that first finds all possible state machine attack paths,
which are mapped to actual acknowledgment manipulation (i.e., concrete) attacks.
We use TCPwn, a TCP congestion control protocol fuzzer, to automatically
find vulnerabilities targeting BBR. TCPwn attack strategies are defined by tuples
that dictate which type of acknowledgment manipulation attack to execute when
the sender is in a certain congestion control state. TCPwn uses the model of the
congestion control algorithm to map all theoretically possible attack paths to actual
attack strategies. It then uses a state inference algorithm that observes network traffic
to discern when to inject the counterfeit acknowledgments. Since TCPwn supports
only loss-based congestion control algorithms, we derive a finite state machine for BBR
by consulting documentation [11–13], presentations [14–19] and source code [10]. We
additionally develop a new algorithm for inferring the current BBR state in real-time
based on network traffic alone, and integrate it with TCPwn.
Using this approach, we automatically generated and executed 30,297 attack
strategies, of which 8,859 caused BBR to send data at abnormal rates: 14 caused
a faster sending rate, 4,025 caused a slower sending rate and 4,820 caused a stalled
connection (i.e., the flow did not complete). All of these attacks originated from an
on-path attacker with read/write access to the flow. Attacks causing slower/stalled
sending performance could be used by an adversary to throttle other flows—leading to
poor performance for victim flows and possibly making more bandwidth available to
the attacker’s flows. Those causing faster sending performance could be used by a de-
structive adversary to increase network congestion, leading to unfairness, poor quality
of service and congestion collapse. From the 8,859 attack strategies, we identify five
classes of attacks, defined by the type of action against an acknowledgment executed
in a particular protocol state. In this work, we study the vulnerability of BBR to
acknowledgement manipulation attacks. We define acknowledgment manipulation at-
tacks as any injection or modification of an acknowledgment packet. To conduct the
attacks, the attacker needs to craft acknowledgment packets, either by fabrication or
modifying intercepted ones. These crafted acknowledgments must be semantic-aware;
the attacker should be aware of the semantics of the congestion control algorithm and
knowledgeable about the meaning of acknowledged bytes. Therefore, in this work, we
consider an on-path attacker (e.g., a switch on the path between client and sender)
capable of injecting spoofed packets as well as intercepting, modifying and controlling
the delivery of legitimate packets in target flows.
To this end, we first analyze BBR’s design, and find all possible paths that can
result in theoretical changes in the sending rate (Section 3). Then, we perform a systematic fuzzing of the Linux TCP BBR implementation, showing how the theoretical
paths can be exploited in implementations (Section 4).
1.1 Thesis Contributions
The contributions of this thesis can be summarized as follows:
• The first finite-state machine model for BBR, used to automatically discover
and generate acknowledgement-based manipulation attacks.
• An algorithm to infer the current BBR state in real-time by observing network
traffic alone.
• An adaptation of TCPwn, a TCP congestion control protocol fuzzer, to BBR,
using our finite-state machine and state inference algorithm to execute our
30,297 automatically generated attack strategies.
• The identification of 5 classes of acknowledgement manipulation attacks against
BBR that cause faster, slower and stalled sending rates: to our knowledge, the
first published work to discover and evaluate such attacks on BBR.
• An analysis of how BBR reacts to these attacks, and how its reaction differs
from that of other congestion control algorithms.
• An analysis of how previously known attacks against TCP congestion control
were not effective against BBR.
1.2 Outline
The rest of this thesis is organized as follows. In Section 2, we provide a high-
level background of TCP congestion control. In Section 3, we show that BBR is
theoretically vulnerable to acknowledgment-based manipulation attacks while also
presenting a new state machine for BBR. In Section 4 we describe our algorithm
for automatically inferring the current state of TCP BBR congestion control from
network tra�c. In Section 5 we present our experimental results, which includes
descriptions of our testing environment, our discovered attacks in BBR and an analysis
of previously known attacks against TCP congestion control that were found to be
ineffective against BBR. In Section 6 we discuss related work, and in Section 7 we
conclude our work.
2 Congestion Control
2.1 Congestion Control Overview
Congestion control is the mechanism that modulates the rate at which data is
sent into a network in order to achieve two main priorities. The first priority is to
saturate (fully utilize) the bottleneck link while avoiding congestion collapse (due to
sending faster than the bottleneck link can support). The network is fully saturated
when the total amount of data in-flight equals the path’s bandwidth-delay product
(BDP), which is the proven optimal operating point of any network [20]. A path’s
BDP is dynamic and typically computed as the product of the bottleneck link’s max-
imum bandwidth and the path's round-trip time without queue delay [21]. The other
priority of congestion control is fairness, which is accomplished by dividing the limited
resources of the network such that all flows simultaneously using the network
receive an equal and fair share of the available bandwidth. Accomplishing these goals
is challenging because networks are unpredictable: links vary in bandwidth capacity
and are shared by anywhere from a few to millions of hosts, as on the global Internet.
While several congestion control algorithms have been developed, most adhere to the
same basic principles first described by Jacobson in 1988 [22].
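As a concrete illustration of the BDP computation described above (the link numbers are hypothetical):

```python
# Worked example of the bandwidth-delay product (BDP), the optimal amount of
# data in flight. The bandwidth and RTT values are illustrative.
bottleneck_bw = 50e6  # bottleneck link bandwidth, bits per second
min_rtt = 0.020       # round-trip time without queue delay, seconds

bdp_bits = bottleneck_bw * min_rtt
bdp_bytes = bdp_bits / 8
print(bdp_bytes)  # 125000.0 bytes: with exactly this much unacknowledged data
                  # in flight, the bottleneck is saturated with no queuing
```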
2.1.1 Inferring Congestion
Congestion control must use signals from the network to infer congestion as an indicator to "back off", or reduce its load on the network. The most popular paradigms
have been loss-based congestion control and delay-based congestion control. Loss-
based congestion control has dominated the Internet since its creation and uses packet
loss to signal congestion. Packet loss occurs when switch buffers along a network
path fill to capacity and are left with no choice but to discard, i.e., drop, incoming
packets. Delay-based congestion control compares predicted and actual round-trip
time (RTT) samples to signal congestion. These signals, be it packet loss or RTT,
govern the rate at which data is sent into the network.
2.1.2 Acknowledgment Packets
Every byte of data is associated with a unique sequence number [23] that incre-
ments by 1 with each additional data byte. A packet containing data is accompanied
by a sequence number, which represents the sequence of the first byte of the data.
Sequence numbers allow data segments to be easily reassembled and acknowledged
by receivers. Acknowledgment packets are sent from the receiver to explicitly inform
the sender about the highest correctly received data byte thus far. They also implic-
itly inform the sender about current network conditions. An acknowledgment packet
acknowledging sequence X implies all bytes up to but not including X have been
correctly received. This allows the sender to either re-transmit lost segments or send
new data. Receivers typically send one acknowledgment packet for every two received
data packets, which is known as delayed acknowledgments.
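The cumulative acknowledgment semantics above can be sketched as follows; this is a simplified model that ignores sequence-number wraparound and selective acknowledgments, and the helper name is hypothetical.

```python
# Sketch of cumulative acknowledgment semantics: an ACK carrying sequence X
# means every byte below X arrived correctly. Simplified; no wraparound/SACK.

def bytes_acked(ack_no, snd_una):
    """New bytes acknowledged when an ACK for ack_no arrives; snd_una is the
    lowest unacknowledged sequence number held at the sender."""
    return max(0, ack_no - snd_una)

# Sender's first unacknowledged byte is 1000.
snd_una = 1000
print(bytes_acked(1500, snd_una))  # 500: bytes 1000..1499 newly acknowledged
print(bytes_acked(1200, 1500))     # 0: an old ACK acknowledges nothing new
```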
2.1.3 Discovering the Target Sending Rate
The target sending rate is one that achieves high throughput and avoids conges-
tion. The sending rate is dictated by a per-connection variable known as the con-
gestion window cwnd which governs the maximum amount of unacknowledged data
allowed in-transit. When a connection first starts, congestion control performs slow
start to quickly discover the available bandwidth of the link. Afterwards, congestion
avoidance is performed whereby data is sent conservatively while slowly probing the
network for available bandwidth.
2.1.4 Achieving Fairness with Competing Flows
Since networks today are shared by several end-hosts, congestion control aims to
share the limited resources of a bottleneck link evenly across all flows. Coexisting
congestion control algorithms may not be fair to each other due to di↵ering probing
and backo↵ mechanisms. For example, delay and loss-based congestion control do not
operate well together. Delay-based congestion control has been shown to reduce its
congestion window much earlier than loss-based [24], resulting in unfair bandwidth
allocation. Because of this, delay-based congestion control is not commonly used in
today’s networks.
2.2 Congestion Signal
Loss-based congestion control uses two signals to detect packet loss: re-transmission
timeouts (RTOs) and duplicate acknowledgment packets. Delay-based congestion
control leverages changes in RTT samples to infer congestion. All methods leverage
feedback from the receiver (i.e., acknowledgment packets) to detect congestion.
2.2.1 Re-transmission Timeout
RTOs ensure data delivery when there is an absence in feedback from the receiver.
Each time a data segment is sent, a timer is started and expires if the segment has
not been acknowledged after a certain amount of time (≥ 1 RTT). On each timeout,
Figure 2.1.: These are two sequence/ACK time lines illustrating the two primary ways packet loss is inferred: re-transmission timeouts (left) and duplicate acknowledgments (right).
the data segment is re-transmitted and the timer restarts with a doubled timeout
time. The initial timeout time is typically a function of the path’s RTT. This event
signifies severe congestion due to the inability of any packets to be delivered across
the network.
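The doubling behavior can be sketched as a timeout schedule; the initial RTO and the cap are illustrative assumptions (real TCP derives the initial RTO from smoothed RTT samples and caps it at an implementation-defined maximum).

```python
# Sketch of re-transmission timeout doubling (exponential backoff).
# The 1-second initial RTO and 60-second cap are assumed values.

def rto_schedule(initial_rto, retries, max_rto=60.0):
    """Timeout used for each successive re-transmission of the same segment."""
    timeouts = []
    rto = initial_rto
    for _ in range(retries):
        timeouts.append(min(rto, max_rto))
        rto *= 2  # doubled on every expiry
    return timeouts

print(rto_schedule(1.0, 5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```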
2.2.2 Duplicate Acknowledgments
The receipt of several duplicate acknowledgments is used to infer that one or
more packets have been lost in the network due to congestion. Because TCP allows
at most one full cwnd into the network at any given time (often consisting of several
packets), a single lost packet will cause all subsequent packets in the window to be
received out-of-order. Whenever a TCP receiver receives an out-of-order packet, its
data contents are ignored and an acknowledgment is sent acknowledging the highest
correctly received packet at that time. Each packet in the window following the first
lost packet will trigger a single duplicate acknowledgment to be sent, allowing the
sender to infer packets have been lost. Today, TCP senders wait for three duplicate
acknowledgments: enough to provide a good indication that packet loss occurred yet small
enough to not trigger the re-transmission timer. Duplicate acknowledgments imply
less severe congestion than RTOs because they indicate out-of-order packets are still
able to be delivered to the receiver.
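The three-duplicate-ACK rule can be sketched as a simple counter over the stream of incoming ACK numbers; this is a simplified model (real TCP also considers window updates and SACK information when classifying duplicates).

```python
# Sketch of duplicate-ACK counting as a loss signal: three ACKs repeating the
# same sequence number, beyond the original, indicate a presumed loss.

DUP_ACK_THRESHOLD = 3

def detect_loss(ack_stream):
    """Return the sequence number presumed lost, or None if no loss inferred."""
    last_ack, dup_count = None, 0
    for ack in ack_stream:
        if ack == last_ack:
            dup_count += 1
            if dup_count >= DUP_ACK_THRESHOLD:
                return ack  # data at this sequence number looks lost
        else:
            last_ack, dup_count = ack, 0
    return None

# ACK 1000 arrives, then repeats three times: the segment at 1000 looks lost.
print(detect_loss([500, 1000, 1000, 1000, 1000]))  # 1000
print(detect_loss([500, 1000, 1500]))              # None
```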
2.2.3 Delayed Acknowledgments
Delay-based congestion control, used by TCP Vegas [25], FAST TCP [26] and
CARD [27], takes a proactive approach to detecting congestion, unlike loss-based
schemes, which react only after congestion has already occurred. The advantage of
delay over loss-based congestion control is that it detects the onset of congestion as
switch bu↵ers grow instead of waiting until they have filled for packet loss to occur.
Delay-based congestion control infers congestion by comparing actual throughput to
expected throughput. If the actual throughput is significantly less than the sending
rate (expected throughput), then the sender assumes congestion exists and subse-
quently backs off.
2.2.4 Adapting to Congestion
Congestion control adapts the sending rate based on the above congestion signals,
typically accomplished by adjusting cwnd based on congestion severity. The most
common approach for adjusting cwnd is the additive increase/multiplicative decrease
(AIMD) scheme, where cwnd linearly increases on each new ACK to probe for available bandwidth until a congestion event occurs, at which point cwnd decreases multiplicatively
to "back off" from the network. When an RTO occurs, cwnd resets to 1 segment
as this usually implies major changes in network conditions. Upon three duplicate
acknowledgments, cwnd halves as this implies small amounts of packet loss. Fairness
is achieved because flows back off at different times, allowing others to occupy the
available bandwidth. The limitation of AIMD is that while it can attain its fair share of
bottleneck bandwidth, it backs off shortly after discovering it. The AIMD scheme is
a high-level approach which varies depending on implementation. In Section 2.4, we
summarize how some widely-used congestion control algorithms detect and react to
network congestion.
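The AIMD reactions described above can be sketched per event, in units of segments; this is a simplified model of the scheme, not any particular implementation.

```python
# Sketch of additive increase / multiplicative decrease (AIMD) reactions,
# one update per event, cwnd measured in segments.

def on_ack(cwnd):
    return cwnd + 1.0 / cwnd   # adds up to ~1 segment per RTT overall

def on_triple_dup_ack(cwnd):
    return max(1.0, cwnd / 2)  # mild congestion signal: halve

def on_rto(cwnd):
    return 1.0                 # severe congestion signal: reset to one segment

cwnd = 10.0
cwnd = on_ack(cwnd)             # 10.1
cwnd = on_triple_dup_ack(cwnd)  # 5.05
cwnd = on_rto(cwnd)             # 1.0
print(cwnd)  # 1.0
```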
2.3 A State Machine for Congestion Control
In this section, we present a state machine for congestion control. Although many
different congestion control implementations exist today, most adhere to the following
state machine, which best resembles TCP New Reno [3].
2.3.1 Slow Start
Slow start is an algorithm aimed to quickly discover the available bandwidth of a
network. Since network links today span several orders of magnitude, slow start per-
forms an exponential search by increasing cwnd by 1 segment on each ACK, allowing
cwnd to double every RTT. This continues until either a congestion event occurs or
cwnd ≥ the slow start threshold ssthresh. In fact, slow start takes effect anytime
cwnd < ssthresh. This slow start threshold is initially set to the receive window
rwnd when a connection is first initiated. The first congestion event during slow start
implies cwnd has become too large and congestion avoidance shall be entered, where
cwnd conservatively increases. Anytime cwnd < ssthresh, TCP is in slow start,
otherwise TCP is in congestion avoidance.
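The exponential search can be sketched as follows; the initial cwnd and ssthresh values are illustrative, and the model collapses the per-ACK increments into one doubling per RTT.

```python
# Sketch of slow start's exponential search: each of cwnd ACKs adds one
# segment, so cwnd doubles every RTT until it reaches ssthresh.
# Units are segments; values are illustrative.

def slow_start_rtts(initial_cwnd, ssthresh):
    """Round trips slow start runs before congestion avoidance takes over."""
    cwnd, rtts = initial_cwnd, 0
    while cwnd < ssthresh:
        cwnd *= 2  # one segment added per ACK => doubling per RTT
        rtts += 1
    return rtts, cwnd

rtts, cwnd = slow_start_rtts(initial_cwnd=1, ssthresh=64)
print(rtts, cwnd)  # 6 64: six round trips to reach a 64-segment threshold
```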
2.3.2 Congestion Avoidance
Congestion avoidance is an algorithm tasked with operating at the estimated
network capacity while conservatively increasing cwnd to “test” the network for any
available bandwidth. This is accomplished by increasing cwnd by 1/cwnd on each
ACK, allowing cwnd to increase by 1 segment every RTT. Upon congestion, the
sender backs off from the network to quickly alleviate congestion by either halving
cwnd or resetting it to 1 segment, depending on the congestion signal.
2.3.3 Fast Recovery
Fast recovery is performed anytime the sender receives three duplicate acknowledg-
ments, which implies one or more segments have been lost and must be re-transmitted.
In Section 2.2.2 we discuss why duplicate acknowledgments imply packet loss due
to congestion. Since duplicate acknowledgments imply less severe congestion, fast
recovery aims to quickly recover from the lost segment by halving cwnd and re-
transmitting the lost segment. Fast recovery exits into the previous state once all
unacknowledged data at the time fast recovery was entered is acknowledged.
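The entry and exit logic described above can be sketched as a small state object; this is a simplified model of the behavior (the class and variable names are hypothetical, and real implementations track additional state such as partial-ACK re-transmissions).

```python
# Sketch of fast recovery: on three duplicate ACKs, halve cwnd and recover;
# exit once everything that was outstanding at entry has been acknowledged.

class FastRecovery:
    def __init__(self, cwnd, snd_nxt):
        self.cwnd = cwnd / 2    # multiplicative decrease on entry
        self.recover = snd_nxt  # highest sequence sent when loss was detected
        self.active = True

    def on_ack(self, ack_no):
        # Recovery completes when all data outstanding at entry is acked.
        if ack_no >= self.recover:
            self.active = False
        return self.active

fr = FastRecovery(cwnd=20, snd_nxt=5000)
print(fr.cwnd)          # 10.0
print(fr.on_ack(3000))  # True: partial ACK, still recovering
print(fr.on_ack(5000))  # False: recovery complete
```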
2.3.4 Exponential Backoff
This algorithm is performed anytime a re-transmission timeout (RTO) expires,
implying severe network congestion. In Section 2.2.1 we describe re-transmission
timeouts and why they imply packet loss due to severe congestion. Each time the
re-transmission timer expires, the lost segment is re-transmitted and the timer is
restarted with a doubled timeout time. This allows senders to exponentially back off
from the network in order to alleviate network congestion altogether. Exponential
backoff repeats until a new acknowledgment is received, after which cwnd resets to
1 segment and slow start is performed. It should be emphasized that after exponential
backoff, the sender "resets" into slow start rather than returning to the previous state.
This is because re-transmission timeouts imply network conditions have significantly
changed, meaning returning to the previous state would leave the sender in a stale
state unrelated to the actual network conditions.
2.4 Congestion Control and Avoidance Schemes
In this section we discuss some well-known congestion control schemes. We discuss
their congestion control and avoidance mechanisms and how they differ from one
another.
2.4.1 TCP Tahoe
TCP Tahoe was the first widely used congestion control scheme, proposed by
Jacobson [22] and motivated by the congestive collapse that occurred on the Internet in
1986. This scheme proposed that senders shall obey a “conservation of packets”,
meaning the sender maintains at most a full congestion window cwnd of unacknowl-
edged packets in the network. TCP Tahoe first proposed the slow-start, congestion
avoidance, fast retransmit and exponential backoff algorithms, described in Section
2.3. Two signals are used to infer congestion: re-transmission timeouts and the
receipt of three duplicate acknowledgments. In the event of packet loss, TCP Tahoe
resets cwnd to 1 packet and the slow-start algorithm is re-entered. It was in this
scheme that additive increase/multiplicative decrease (AIMD) first appeared.
2.4.2 TCP New Reno
TCP New Reno preserves the same basic principles of TCP Tahoe but includes
a different mechanism for reacting to duplicate acknowledgments. Since the receipt
of duplicate acknowledgments indicates that out-of-order segments are still being
delivered to the receiver rather than being dropped, network conditions are not as
severe as a re-transmission timeout would indicate. Instead of resetting cwnd to 1
packet and entering slow start, TCP New Reno performs "Fast Recovery" by halving
cwnd and skipping slow start. Since duplicate acknowledgments indicate less severe
congestion, halving cwnd alleviates congestion and allows the sender to more quickly
recovery to its sending rate before the congestion event.
2.4.3 TCP Vegas
TCP Vegas is unique in that it primarily uses changes in RTT measurements, rather
than packet loss, to signal congestion. TCP Vegas compares expected throughput
against actual throughput and adjusts cwnd based on the degree to which they differ.
Expected throughput represents the current sending rate, while actual throughput
represents the rate at which the network was able to deliver packets to the receiver. On each
acknowledgment, TCP Vegas computes d = expected - actual. By definition, d is
always non-negative because the actual throughput should never be greater than the
expected throughput. The closer d is to 0, the better the network was able to sustain
the sender's sending rate. If d is relatively small (within a certain threshold), then
cwnd increases linearly. If d is moderately large (within another threshold), then
cwnd is unchanged. Otherwise (when d is sufficiently large), cwnd decreases linearly.
It should be emphasized that TCP Vegas proactively detects congestion rather than
reactively detecting it after packet loss. TCP Vegas is seldom used today
because research efforts have shown that its performance suffers when coexisting
with other, loss-based congestion control algorithms or after network path
changes [28, 29].
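One common formulation of the Vegas update is sketched below. The alpha/beta thresholds (here 1 and 3 packets) and the conversion of the throughput difference into an estimate of queued packets are assumptions of this sketch, not values from the text.

```python
ALPHA, BETA = 1.0, 3.0  # assumed thresholds, in packets queued

def vegas_update(cwnd: float, base_rtt: float, rtt: float) -> float:
    """One Vegas cwnd adjustment; cwnd in packets, RTTs in seconds."""
    expected = cwnd / base_rtt            # rate if there were no queuing
    actual = cwnd / rtt                   # rate the network actually sustained
    d = (expected - actual) * base_rtt    # estimated packets sitting in queues
    if d < ALPHA:
        return cwnd + 1.0                 # little queuing: grow linearly
    if d > BETA:
        return cwnd - 1.0                 # too much queuing: shrink linearly
    return cwnd                           # in between: hold steady
```

With base_rtt = 100 ms, a measured RTT of 100 ms (no queuing) grows the window, while a measured RTT of 200 ms shrinks it.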
2.4.4 TCP Compound
TCP Compound is the default TCP congestion control algorithm in Windows,
motivated by the observation that AIMD grows cwnd too conservatively, often leaving large-BDP
networks underutilized. The work in [30] points out that with AIMD it would take
about 1 hour for cwnd to grow to fully utilize a 10 Gbps link, assuming no packet
loss. With inevitable packet loss, it would take even longer, or perhaps never reach the
target operating point, because cwnd would be halved several times as it grows. To
combat this, TCP Compound employs both loss- and delay-based congestion control
to control the rate at which cwnd increases. TCP Compound exponentially increases
cwnd and uses delay-based congestion control to detect the onset of congestion in
order to know when to begin growing cwnd linearly. It uses loss-based congestion
control to back off from the network.
2.4.5 TCP CUBIC
TCP CUBIC is currently (2019) the default TCP congestion control algorithm
in the MacOS and Linux kernels. TCP CUBIC primarily differs in its cwnd growth
function during the congestion avoidance stage: rather than increasing linearly, cwnd
grows as a cubic function of time, giving three phases to its growth.
In the first phase, cwnd grows rapidly to quickly approach
the available bandwidth while transitioning toward a plateau. In the second phase, cwnd
remains nearly unchanged, meaning data is sent at a steady pace. Finally, the growth of
cwnd transitions from the plateau into rapid growth again to probe for additional
bandwidth, after which the process repeats. If packet loss occurs during
any of these phases, cwnd is decreased multiplicatively and the first phase is
re-entered.
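The three-phase growth curve described above has a standard closed form (from the CUBIC literature); the constants C = 0.4 and beta = 0.7 are commonly cited defaults, not values taken from this thesis.

```python
C = 0.4      # cubic scaling constant (assumed default)
BETA = 0.7   # multiplicative decrease factor (assumed default)

def cubic_window(t: float, w_max: float) -> float:
    """Window as a cubic function of time t (seconds) since the last loss.

    w_max is the window size when the last loss occurred; K is the time at
    which the curve returns to w_max (the plateau's midpoint).
    """
    k = (w_max * (1 - BETA) / C) ** (1 / 3)
    return C * (t - k) ** 3 + w_max
```

At t = 0 the window starts at BETA * w_max, rises steeply, flattens around t = K (the plateau), and then grows convexly to probe beyond w_max.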
2.5 Congestion Control Attacks
Congestion control is highly susceptible to attacks targeting its sending rate behavior.
Since the only feedback senders have about network congestion is acknowledgment
packets, an attacker who injects or manipulates acknowledgments can mislead
the sender about network congestion. Such attacks can make the
sender send slower or faster, or cause the connection to stall (meaning no new data is
sent).
There are several attacks targeting congestion control. For example, Savage et
al. [6] describe an attacker who divides an acknowledgment packet acknowledging
m bytes into n individual acknowledgment packets, each acknowledging m/n bytes.
Since cwnd increases on each new ACK, this causes cwnd to grow n times as fast.
Another example is injecting several duplicate acknowledgments to cause a faster
sending rate. This attack relies on the fact that after cwnd is halved on three duplicate
acknowledgments, each additional duplicate acknowledgment decreases the number
of in-flight bytes. As a result, each such duplicate acknowledgment gives
the congestion control a larger window of data that can be sent immediately.
Finally, optimistically acknowledging data causes an increased sending rate. In this
attack, data is acknowledged after it is sent but before it is received, causing cwnd to
increase much earlier than it otherwise would.
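The ACK-division effect described above is easy to quantify: during slow start the sender adds one MSS per arriving ACK, so splitting one ACK into n smaller ACKs grows cwnd n times as fast. The numbers below are illustrative, not taken from a real trace.

```python
MSS = 1  # count cwnd in segments for simplicity

def cwnd_after_acks(cwnd: int, num_acks: int) -> int:
    """Slow-start growth: +1 MSS for every ACK that arrives."""
    return cwnd + num_acks * MSS

# One honest ACK covering a 50-segment window vs. the same bytes split
# into 50 tiny ACKs by the attacker:
honest = cwnd_after_acks(10, 1)
divided = cwnd_after_acks(10, 50)
```

The divided case reaches a window of 60 segments in the time the honest case reaches 11, which is the n-fold speedup Savage et al. describe.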
The Shrew Attack [31] is a DoS attack that takes advantage of TCP re-transmission
timers. The attacker prevents acknowledgment packets from reaching the sender,
eventually causing the re-transmission timer to expire and cwnd to reset to 1 segment. This
stalls the connection, or decreases the sending rate if the attack is executed repeatedly.
3 BBR Overview & Vulnerabilities
In this section, we provide a high-level overview of BBR congestion control. We
discuss how BBR models the network path in order to converge to the optimal operating
point. We then provide a detailed and comprehensive BBR finite-state machine
(FSM), the first in the literature, along with descriptions of each state within the FSM.
We then discuss how BBR is prone to acknowledgment-based manipulation attacks
targeting its network path model estimation.
3.1 BBR Overview
BBR (Bottleneck Bandwidth and Round-trip propagation time) is a new sender-side
congestion control algorithm for TCP and QUIC, proposed by Google in 2016,
that takes a fundamentally new approach to congestion control. BBR is motivated by
the observation that loss-based congestion control algorithms such as CUBIC and New Reno assume
packet loss directly implies network congestion, which is not always the case. As a
result, senders back off long after the onset of congestion, or overreact to packet loss
by backing off more than they should. This behavior leads to poor network utilization.
Figure 3.1 compares the optimal (BBR's target) and loss-based operating points.
BBR attempts to converge to the optimal operating point by pacing the rate at
which data is sent to match the bottleneck bandwidth, achieving high utilization,
and periodically draining queues, achieving low latency.
Figure 3.1.: How the operating point of loss-based congestion control compares to the proven optimal operating point (BBR's target). With loss-based congestion control, packet loss does not occur until queues overflow: the sender keeps pushing data into the network, filling the bottleneck buffer until it overflows (signaled through packet loss). This leads to unnecessarily high latency and packet loss. The optimal operating point is reached when the bottleneck is saturated with no queue buildup. [Figure: delivery rate and RTT plotted against data in-flight, annotated with minRTT, minRTT+queue, BtlBw, and BDP.]

Instead of relying solely on packet loss to infer congestion, BBR is model-based,
meaning congestion is inferred primarily from two properties of the network path: its
bottleneck bandwidth BtlBw and round-trip propagation delay RTprop. BBR paces
its sending rate proportionally to BtlBw and aims for at least one BDP = BtlBw ×
RTprop worth of data in-flight for full utilization. At any given time, BBR's sending
rate is limited by one of two factors: the congestion window cwnd, or the pacing rate.
The pacing rate is always equal to pacing_gain × BtlBw and defines inter-packet
spacing. Pacing, first proposed by Zhang et al. [32], aims to reduce burstiness and,
in some situations, offers improved fairness and throughput [33]. BBR caps cwnd to
2 × BDP to overcome delays in received acknowledgments, which would otherwise
cause BBR to underestimate the bottleneck bandwidth [14].
Obtaining an accurate and up-to-date model of the network path is essential to
BBR's effectiveness, so the model is updated on every new acknowledgment. Below we
describe how BBR sequentially obtains accurate measurements of BtlBw and RTprop
while accomplishing congestion control.
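The two rate limits above can be written out directly. This is a sketch assuming BtlBw in bytes per second and RTprop in seconds; the 2 × BDP cwnd cap and pacing_gain come from the text, while the function name is ours.

```python
def bbr_limits(btl_bw: float, rt_prop: float, pacing_gain: float):
    """Return (pacing_rate, cwnd_cap) for the current path model."""
    bdp = btl_bw * rt_prop                 # bandwidth-delay product, in bytes
    pacing_rate = pacing_gain * btl_bw     # controls inter-packet spacing
    cwnd_cap = 2 * bdp                     # bound on unacknowledged data
    return pacing_rate, cwnd_cap
```

For a 100 Mbit/s (12.5 MB/s) bottleneck with a 10 ms RTprop and pacing_gain = 1.25, the sender paces at 15.625 MB/s and allows up to 250 kB in flight.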
Algorithm 1 Delivery rate samples [13] are computed to estimate the bottleneck bandwidth. For each new ACK, the average ACK rate is computed between when a data segment is sent and when an acknowledgment is explicitly received for it. Delivery rates are capped by the send rate, as data should not arrive at the receiver faster than it is transmitted.

Input: A data segment P and a BBR connection C.
Output: The delivery rate sample

1: function ComputeDeliveryRateSample(P, C)
2:   data_acked = C.delivered - P.delivered
3:   ack_elapsed = C.delivered_time - P.delivered_time
4:   send_elapsed = P.sent_time - P.first_sent_time
5:   ack_rate = data_acked / ack_elapsed
6:   send_rate = data_acked / send_elapsed
7:   delivery_rate = min(ack_rate, send_rate)
8:   return delivery_rate
9: end function
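Algorithm 1 transcribes directly into runnable code. This is a hedged sketch, not the kernel implementation; the dataclass field names mirror the pseudocode, and the timestamps in the usage note are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Conn:
    delivered: int         # total bytes delivered on the connection so far
    delivered_time: float  # arrival time of the most recent ACK

@dataclass
class Segment:
    delivered: int         # snapshot of Conn.delivered when this segment was sent
    delivered_time: float  # snapshot of Conn.delivered_time at send time
    sent_time: float
    first_sent_time: float # send time of the packet that started this flight

def delivery_rate_sample(p: Segment, c: Conn) -> float:
    """Compute one delivery rate sample (bytes/sec) for a newly ACKed segment."""
    data_acked = c.delivered - p.delivered
    ack_elapsed = c.delivered_time - p.delivered_time
    send_elapsed = p.sent_time - p.first_sent_time
    ack_rate = data_acked / ack_elapsed
    send_rate = data_acked / send_elapsed
    return min(ack_rate, send_rate)   # capped by the send rate
```

For example, 100 kB acknowledged over one second yields an ack rate of 100 kB/s; if the same data took 0.9 s to transmit, the send rate is higher and the sample is capped at 100 kB/s.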
3.2 Estimating the Network Path Model
Accurate measurements of BtlBw and RTprop are obtained sequentially, at different
times and under different network conditions. This is because the network conditions
required to obtain accurate measurements of each parameter interfere with each other's
measurements. At mutually exclusive times, BBR adjusts its sending rate so the network
conditions are met for each parameter: for BtlBw, the sending rate is increased to
discover available bottleneck bandwidth, while for RTprop, the congestion window is
reduced to 4 packets. It is important to note that increasing the sending rate to measure
BtlBw may create queues, which would yield inaccurate RTprop measurements, while
decreasing the sending rate to measure RTprop would not allow available bandwidth to
be discovered.
3.2.1 Bottleneck Bandwidth
The bottleneck bandwidth of the network path connecting the sender and receiver
is estimated by retaining the maximum observed delivery rate sample over the past
10 RTTs and storing that value in BtlBw. This mechanism is known as the BtlBw
max filter. A new delivery rate sample is computed on each new ACK and represents
the average ACK rate roughly between when a data segment is sent and when it
is acknowledged. Algorithm 1 is executed whenever a new ACK is received. The
average ACK rate is computed as the amount of newly acknowledged data divided
by how long it took for that data to be acknowledged.
When a data segment is sent, the starting time is set to the last ACK's arrival time,
and the end time is when an ACK acknowledging the data segment arrives. This time
is typically very close to 1 RTT, but can be greater than 1 RTT because the last
ACK could have arrived before the segment was sent. The amount of newly acknowledged
data is the difference in cumulative acknowledged data between those
two ACKs. These samples represent the rate at which data is being received by the
receiver, which is always limited by the bottleneck link. BBR, however, only obtains
an accurate measurement when data is sent at a rate greater than or equal to
the actual bottleneck bandwidth for at least 1 RTT. To accomplish this, BBR sends
25% faster than its current BtlBw estimate for 1 RTT every 8 RTTs, approaching the
point where data is sent at or above the actual bottleneck bandwidth and thereby
obtaining an accurate delivery rate sample.
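A windowed max filter like the BtlBw filter described above can be sketched as follows. Keeping (round, value) pairs and expiring old ones is one simple approach; the actual kernel code uses a more compact three-slot "minmax" structure, so this is illustrative only.

```python
from collections import deque

class WindowedMax:
    """Track the largest sample seen within the last `window_rounds` rounds."""

    def __init__(self, window_rounds: int = 10):
        self.window = window_rounds
        self.samples = deque()  # (round_number, value) pairs

    def update(self, rnd: int, value: float) -> float:
        self.samples.append((rnd, value))
        # expire samples older than the window
        while self.samples and self.samples[0][0] <= rnd - self.window:
            self.samples.popleft()
        return max(v for _, v in self.samples)
```

A large sample dominates the estimate for 10 rounds and then ages out, which is exactly why a short traffic burst can inflate BtlBw for as long as 10 round-trips (Section 3.3). Swapping `max` for `min` and rounds for seconds gives the RTprop min filter of Section 3.2.2.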
3.2.2 Round-Trip Propagation Delay
Estimating RTprop is accomplished by employing a min filter that retains the
minimum observed round-trip time sample over the past 10 seconds. Round-trip
samples are measured by computing the elapsed time between when a flight of data is
sent and when it is acknowledged. Accurately measuring the RTT presents a challenge
because packets queued in switch buffers cause increased, inaccurate RTT samples.
To overcome this, BBR drains switch queues by periodically limiting its sending
behavior. After measuring BtlBw, it is entirely possible the bottleneck is already
saturated, causing queue build-up. To mitigate this, BBR sends 25% slower than the
current estimated BtlBw immediately after probing the bottleneck link. BBR also
reduces cwnd to 4 packets every 10 seconds to update RTprop, which we describe in
greater detail in the following section.
3.3 Rate Limiting
BBR attempts to detect token-bucket policers, as they can cause data to be sent
faster than the token drain rate, leading to high packet loss. In networks with token-bucket
policers, it is common to see bursts of traffic pass until tokens are exhausted,
after which packets are dropped. Due to BBR's long-lived BtlBw max filter,
the burst rate would cause the estimated bottleneck bandwidth to be greater than
the token drain rate, leading to high packet loss for as long as 10 round-trips. A
token-bucket policer is detected when there is significant packet loss alongside consistent
throughput, after which BBR paces its sending rate at the estimated token drain rate
for 48 round-trips.
Variable: Description
bw: maximum measured bottleneck bandwidth.
est_bw: estimated token-bucket drain rate.
cwnd: the maximum amount of unacknowledged data allowed in-flight.
save_cwnd: cwnd before entering recovery, so it can be restored when finished.
fullbw: boolean indicating when the pipe is filled.
idx: current index into rmult.
min_rtt: minimum measured RTT.
min_rtt_ts: timestamp when min_rtt was measured.
probe_ts: timestamp when ProbeRTT was entered.
rate: current pace at which data is sent.
rmult: array containing the 8 pacing gain phases.

Event: Description
ACK: receipt of an acknowledgment packet, indicating the highest correctly received byte.
MaxBW: new maximum bottleneck bandwidth is observed.
MinRTT: new minimum RTT sample is observed.
New: new acknowledgment received, acknowledging previously outstanding data.
LostPacket: TCP packet loss event (3 duplicate ACKs).
RTO Timeout: outstanding data has not been acknowledged for many RTTs.

Table 3.1.: Descriptions of variables unique to the BBR finite-state machine, and its events (Figure 3.2).
Figure 3.2.: TCP BBR finite-state machine; see Table 3.1 for variable descriptions. [Figure: state diagram over the states Init, Startup, Drain, ProbeBW, ProbeRTT, Recovery, ExponentialBackoff, and RateLimited, with each transition annotated with its triggering event/condition and the resulting updates to cwnd, rate, idx, fullbw, est_bw, save_cwnd, min_rtt_ts, and probe_ts.]
3.4 A State Machine for BBR
To systematically analyze BBR, we derive a finite-state machine (FSM) for it.
To the best of our knowledge no such model has been published, so we empirically
developed our own to aid attack discovery, drawing on documentation [11–13], presentations
[14–19], and source code [10]. In Figure 3.2, we illustrate our BBR finite-state
machine and describe its variables and events in Table 3.1.

BBR employs several mechanisms similar to those of traditional congestion control
algorithms, which we model in Section 2.3. When a flow first begins, BBR employs a
mechanism to quickly discover the available bandwidth (i.e., slow start). Afterwards,
BBR paces its sending rate at the estimated bandwidth (i.e., congestion avoidance)
while simultaneously probing the network for available bandwidth and updating its
network path model: BtlBw and RTprop. While packet loss is not at BBR's core,
BBR includes mechanisms to handle such cases, similar to Fast Recovery and
exponential backoff. Finally, BBR includes methods for detecting and accounting for
token-bucket policers, as they can allow traffic bursts until tokens run out, causing
BBR to send too quickly and incur packet loss. Below we briefly describe the states of
our BBR FSM:
3.4.1 Startup
Similar to slow start, Startup is the first state BBR enters; it aims to quickly
discover the available bottleneck bandwidth. Since the available bandwidth is unknown
to end hosts and can span several orders of magnitude, BBR performs an exponential
search by doubling its sending rate on each round-trip. BBR exits Startup when
delivery rate samples begin to plateau, indicating the network limit has been reached.
Startup transitions into Drain either when cwnd reaches ssthresh or when three
consecutive delivery rate samples each show less than a 25% increase over the last,
indicating the bottleneck bandwidth has been reached.
3.4.2 Drain
This state aims to quickly drain queues that were likely created during Startup.
Since the sending rate grew exponentially while in Startup, the actual bottleneck was
probably exceeded, causing queues to form. Drain reduces those queues in a single
round-trip by sending data at the inverse rate used in Startup, after which ProbeBW
is entered.
3.4.3 ProbeBW
Similar to congestion avoidance, ProbeBW aims to pace its sending rate to the
estimated bottleneck bandwidth, achieve fairness with other flows, and probe for
additional bandwidth. In addition, BBR aims to achieve these goals with little-to-no
queuing delay. These goals are accomplished using gain cycling, where pacing_gain cycles
through a set of eight phases, [5/4, 3/4, 1, 1, 1, 1, 1, 1], each lasting one RTprop.
In the first phase, pacing_gain = 5/4, meaning BBR sends data 25% faster than
BtlBw to probe for additional bandwidth. In the second phase, BBR sends 25% slower
than BtlBw to drain any queues created in the previous phase and to achieve fairness with
other flows. In the remaining phases, BBR sends at exactly BtlBw, the currently perceived
bottleneck bandwidth of the path.
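The gain cycle above can be sketched in a few lines. The phase array and the random entry index (2..7) come from Figure 3.2; the function names are ours, and a real implementation also skips phases based on in-flight data, which this sketch omits.

```python
import random

RMULT = [1.25, 0.75, 1, 1, 1, 1, 1, 1]   # the eight pacing-gain phases

def next_pacing_gain(idx: int) -> tuple[int, float]:
    """Advance one phase (one RTprop) and return (new index, pacing gain)."""
    idx = (idx + 1) % len(RMULT)
    return idx, RMULT[idx]

def enter_probe_bw() -> int:
    """On entering ProbeBW, start at a random steady phase (idx = rand(2,7))."""
    return random.randint(2, 7)
```

Starting at a random steady phase avoids many flows synchronizing their 5/4 probing phases, and guarantees the cycle never begins in the 3/4 drain phase.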
3.4.4 ProbeRTT
The goal of this state is to obtain a recent and accurate measurement of RTprop,
the round-trip propagation delay of the path without queuing delay. Since
queuing delay inflates the measured RTprop, ProbeRTT explicitly backs off from the
network in order to drain any queues. This way, the RTprop min filter can capture
an RTprop measurement without queuing. ProbeRTT is entered if 10 seconds have
elapsed since RTprop was last updated, and lasts for 200 ms, long enough to overlap
with the ProbeRTT states of many other flows (with differing RTTs) such that queues are
drained synchronously.
3.4.5 Recovery
BBR's Recovery state is entered under the same circumstance as fast recovery
(described in Section 2.3.3), namely when three duplicate acknowledgments are
received. The two differ in how cwnd is modified. With fast recovery, cwnd is halved;
with BBR, cwnd is set to in_flight during recovery and restored to 2 ×
BDP afterwards. It should be noted that in Recovery, BBR does not back off from the
network (as loss-based congestion control does) but rather prevents the amount of data
in-flight from increasing during recovery.
3.4.6 Exponential Backoff

This state is entered upon a re-transmission timeout (RTO), which indicates lost data
because no new acknowledgments have arrived for several RTTs. This state re-transmits the lost
segment with a doubled timeout; that is, it backs off exponentially from the network.
Once an acknowledgment is received, the current path model is discarded and Startup
is entered.
3.4.7 Rate Limited
This state is implemented in Linux TCP BBR and entered when a token-bucket
policer is detected on the network, as policers can lead to high amounts of packet
loss. The state is entered when the packet loss-to-delivered ratio exceeds 20% while
throughput remains steady. BBR then sets BtlBw to the estimated token-bucket drain rate
and sustains this for 48 round-trips.
3.5 Example Attack Against BBR
At first glance, it may appear that BBR is less prone to acknowledgment-based
manipulation attacks since it differs from loss-based congestion control
algorithms. However, our FSM model illustrates how received acknowledgments can
influence the transitions between states and the estimation of BtlBw and RTprop,
reinforcing our hypothesis that BBR is also susceptible to acknowledgment
manipulation attacks. For instance, when the sender is in the ProbeBW state, if the attacker
sends ACKs for data that has not yet been received by the receiver, then the sender
will overestimate BtlBw and consequently increase its sending rate. Similarly, delaying
ACKs while the sender is in the ProbeBW state will inflate ack_elapsed, which
in turn will trick the sender into underestimating the delivery rate as well as BtlBw,
eventually reducing the sending rate. Considering these conceptual attacks, we apply
fuzzing to BBR to find such manipulation attacks in an automated fashion.
4 Automated Attack Exploration in BBR
The main goal of this thesis is to systematically examine vulnerabilities of the
Linux TCP BBR v1.0 implementation [10]. We focus on acknowledgment-based
attacks conducted by an on-path attacker and apply a TCP congestion control fuzzer,
TCPwn. Below we describe the attacker model and the changes we had to make to
TCPwn in order to apply it to BBR.
4.1 Attacker Model
We focus on manipulation attacks on the implementation of BBR, where the attacker
aims to mislead the sender's congestion control about the current network
conditions. These attacks are conducted through maliciously crafted or modified
acknowledgment packets, and can result in increasing, decreasing, or stalling
the throughput of the target flow.

In order to achieve its goals, the attacker applies an attack strategy, defined
as a sequence of acknowledgment-based malicious actions together with the corresponding
sender states in which each action is conducted. In this work, we focus on TCP flows
with bulk data transfers because they are widespread and the effect of the conducted
attacks is easy to measure.
4.2 Modifying TCPwn for BBR
We leverage TCPwn [3], a recent open-source platform designed to automatically
find manipulation attacks in TCP congestion control implementations. We
chose TCPwn because it does not require the source code of the congestion control
implementation and is designed specifically for TCP congestion control
implementations.

At its core, TCPwn employs a network protocol fuzzer to find acknowledgment-based
manipulation attacks against TCP congestion control implementations. Instead
of applying random fuzzing, TCPwn guides the fuzzer using a model-guided
technique, where the model is represented as a finite-state machine (FSM) that captures
the main functionality of several TCP congestion control algorithms.
To fuzz an actual implementation of TCP congestion control in its native
environment, TCPwn utilizes virtualization and proxy-based attack injection. To
be effective, these attacks must be executed at the right time during execution, so
TCPwn monitors the exchanged network packets to infer the current state of
the sender in real time.
While TCPwn is amenable to TCP congestion control algorithms, it assumes
that the algorithm follows a loss-based model derived from TCP New Reno. Thus, we
cannot directly apply it to BBR, as the models are substantially different. We instead
feed our own BBR FSM (see Figure 3.2) as an input to TCPwn to generate
abstract attack strategies, each of which specifies a vulnerable path in the FSM that
the attacker can exploit. Each transition on a vulnerable path dictates the network
condition that the attacker needs to trigger to mount the desired attack.

While there are several ways to trigger the necessary network conditions, TCPwn
takes the abstract strategies and converts them into concrete attack strategies
consisting of basic acknowledgment-packet-level actions (e.g., send duplicate ACKs).
During fuzzing, the attack injector applies these actions in particular states of the
FSM. Although the generation of attack strategies is fully automated, TCPwn
requires us to provide a manually crafted mapping between network conditions and
basic actions, because the mapping relies on domain knowledge about the underlying
model (in our case, the BBR FSM).
Another change we had to make was to the state inference algorithm. TCPwn
must be able to discern the state of the sender's congestion control in order to inject
attacks at the correct times. The state inference available in TCPwn cannot infer
BBR's states because the algorithm expects the underlying model (i.e., FSM) to be
that of TCP New Reno. Hence, we develop a new state inference algorithm for BBR that
infers the sender's state from network traffic alone, described in Section 4.3.
4.3 State Inference for BBR
We present a novel algorithm, described in Algorithm 2, that infers the current state
in real time by passively observing network traffic. Our algorithm operates
by computing flow metrics over each round-trip and comparing metrics across intervals
to determine BBR's state. We compute metrics per round-trip because BBR
sustains a constant sending behavior for at least one round-trip.

When our algorithm first starts, we begin a round-trip by recording the first data
packet's sequence number and end the round-trip when that sequence number is acknowledged.
During each round-trip we collect flow metrics, and when the round completes we compute
the average throughput, re-transmission count, and number of data packets sent. We then
update the inferred BBR state on each new round by comparing metrics across round-trips.
For BBR, the most revealing metric about the current state is the change in average
throughput across round-trips. On each round, we compute the throughput ratio relative
to the last round. For example, if the current and last rounds had average throughputs
of 30 and 20 Mbit/s respectively, then the ratio would equal 1.5. We also infer
the current state using the currently inferred state, following BBR's natural state
transitions (e.g., Startup is followed by Drain).
We infer Startup (lines 42–44) if throughput has increased significantly since the
last round-trip. Drain is primarily inferred if the current state is Startup and
throughput has stopped increasing. ProbeBW is inferred if 1.4 > ratio > 0.6, which
allows ProbeBW to be inferred during phases 1 and 2 of gain cycling. We also infer
ProbeBW when BBR transitions out of Drain (lines 39–41), indicated by a significant
increase in throughput. We infer ProbeRTT when we observe only 4-5 data packets in
the last round, resulting from cwnd being reduced to 4 packets to drain queues, and exit
after 10 data packets have been sent. Recovery is inferred when re-transmitted segments
have been observed, and exits when the highest data sequence number (recorded when
Recovery was entered) is acknowledged. Exponential Backoff is inferred when the
estimated RTO has elapsed since the last data packet. Lastly, we infer RateLimited
when more than 16 round-trips have passed with little variance in average throughput.

We infer Exponential Backoff, Recovery, and Drain without waiting for a round-trip
to complete, as these states can be entered at any time during the flow.
Algorithm 2 BBR state inference algorithm

1: function OnNewBBRPacket(packet)
2:   newInt = CheckInterval(packet)  /* a round-trip has passed */
3:   if dataPackets > 0 and RTOPassedForDataPacket() then
4:     priorState = currentState
5:     currentState = EXPONENTIAL_BACKOFF
6:     ComputeIntervalMetrics()
7:   else if retransmissions > 0 and not currentState == RECOVERY then
8:     priorState = currentState
9:     currentState = RECOVERY
10:    /* leave recovery when high water is ACK'd */
11:    highWater = highestDataSequence
12:  else if currentState == RECOVERY then
13:    ratio = ThroughputRatioSinceLastRoundTrip()
14:    if totalAcked ≥ highWater then
15:      currentState = priorState  /* return to previous state */
16:      SetRetransmissionCount(0)
17:    else if newInt and priorState == STARTUP and 0.7 > ratio > 0.1 then
18:      priorState = DRAIN
19:    end if
20:  else if currentState == PROBERTT and dataPackets > 10 then
21:    currentState = priorState
22:  else if newInt then
23:    currentState = UpdateBBRNewInterval()
24:  else if dataPackets ≥ 10 then
25:    /* check for drain on frequent basis */
26:    ratio = ThroughputRatioSinceLastRoundTrip()
27:    if currentState == STARTUP and 0.7 > ratio > 0.1 then
28:      priorState = currentState
29:      return DRAIN
30:    end if
31:  end if
32: end function

33: function UpdateBBRNewInterval()
34:   ratio = ThroughputRatioSinceLastRoundTrip()
35:   dataPackets = GetDataPacketCount()
36:   nonIncrease = NonIncreaseIntervalCount()
37:   if 6 > dataPackets > 3 then  /* small amount of data in-flight */
38:     return PROBERTT
39:   else if currentState == DRAIN and ratio > 1.4 then
40:     priorState = currentState
41:     return PROBEBW
42:   else if not currentState == PROBEBW and ratio > 1.4 then
43:     priorState = currentState
44:     return STARTUP
45:   else if currentState == STARTUP and 0.7 > ratio > 0.1 then
46:     priorState = currentState
47:     return DRAIN
48:   else if currentState == STARTUP and nonIncrease > 10 then
49:     priorState = currentState
50:     return DRAIN
51:   else if intervalCount > 16 and throughputVariance < 100 then
52:     priorState = currentState
53:     return RATE_LIMIT
54:   else if not currentState == STARTUP and 1.4 > ratio > 0.6 then
55:     priorState = currentState
56:     return PROBEBW
57:   else
58:     return currentState
59:   end if
60: end function

61: function CheckInterval(packet)
62:   if not inInterval and IsDataPacket(packet) then
63:     ClearIntervalData()
64:     ackMark = packet.sequenceNumber
65:     inInterval = true
66:   end if
67:   if inInterval and packet.ackNumber ≥ ackMark then
68:     inInterval = false
69:     ComputeIntervalMetrics()
70:     return true
71:   end if
72:   return false
73: end function
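For concreteness, the per-interval decision table of UpdateBBRNewInterval can be rendered as a pure Python function. The string state names and the omission of priorState bookkeeping are simplifications of ours; thresholds are taken directly from the algorithm.

```python
def classify_interval(current: str, ratio: float, data_packets: int,
                      non_increase: int, interval_count: int,
                      throughput_variance: float) -> str:
    """Return the inferred BBR state for one completed round-trip interval."""
    if 3 < data_packets < 6:                        # cwnd ~4 packets: ProbeRTT
        return "PROBERTT"
    if current == "DRAIN" and ratio > 1.4:          # leaving Drain: rate jumps back up
        return "PROBEBW"
    if current != "PROBEBW" and ratio > 1.4:        # sustained large increases
        return "STARTUP"
    if current == "STARTUP" and 0.1 < ratio < 0.7:  # sharp drop after Startup
        return "DRAIN"
    if current == "STARTUP" and non_increase > 10:  # plateau after Startup
        return "DRAIN"
    if interval_count > 16 and throughput_variance < 100:
        return "RATE_LIMIT"                         # long, unusually flat throughput
    if current != "STARTUP" and 0.6 < ratio < 1.4:  # gain-cycling band
        return "PROBEBW"
    return current
```

The ordering of the checks matters and mirrors lines 37-57 of Algorithm 2: the ProbeRTT packet-count test must run first, because during ProbeRTT the throughput ratio alone is ambiguous.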
5 Experimental Results
In this section we describe and analyze our discovered attacks on BBR congestion
control. We first describe the testing environment used. We then describe how we
analyzed and classified the attacks. Lastly, we discuss and illustrate the discovered
attack classes in detail.
5.1 Experiment Setup
5.1.1 Environment
Our testing environment, shown in Figure 5.1, consists of four virtual machines
running Ubuntu Linux 17.10 connected by a virtual dumbbell network topology. We limit
the bottleneck bandwidth to 100 Mbit/sec with a 500-packet queue and a 10 ms end-
to-end latency between either end of the topology. We configure the virtual network
with reasonably low latency and high bandwidth, allowing us to isolate the impact
of attack strategies on BBR in a “friendly” environment.
For each attack strategy, two TCP flows are instantiated, a victim flow and a
background flow, each transferring an identical 100MB file over HTTP. The attacker is
located between either end of the topology. The victim flow uses a Linux TCP BBR
sender and is injected with attacks, while the background flow is not targeted. We
use tcpdump to measure both flows' performance, captured between the senders and
the bottleneck. The victim flow is measured to understand the impact of the attack,
1. Optimistic ACK (on/off-path): Acknowledge the highest sent sequence number
before it is received, hiding all losses. This causes an overestimated BtlBw and
data to be sent earlier/faster than otherwise. Result: Faster.

2. Delayed ACK (on-path): Delay acknowledgments from the receiver from reaching
the sender for a fixed amount of time, causing BtlBw to be underestimated and
data to be sent at a slower pace. Result: Slower.

3. Repeated RTO (on-path): For the entire flow, prevent new data from being
acknowledged, causing an RTO. Optimistically acknowledge the lost segment,
causing Startup to be entered. This wastes substantial amounts of time (not
sending data) during the periods before each RTO. Result: Slower.

4. RTO stall (on-path): Prevent new data from ever being acknowledged, causing
an RTO and exponential backoff that is never exited. This causes the connection
to stall, as no new data will ever be sent. Result: Stalled.

5. Sequence Number Desync (on/off-path): Acknowledge the highest sent sequence
number before it is received, hiding all losses and causing sequence numbers to
desynchronize. Induce an RTO so that exponential backoff is entered. For each
re-transmission, the receiver replies with a lower acknowledgment number than
the sender expects, so the lost segment is never acknowledged and exponential
backoff never exits. Result: Stalled.

Table 5.1.: Descriptions of discovered attack classes targeting Linux TCP BBR congestion control.
Figure 5.1.: TCPwn testing environment.
Algorithm 3 Attack strategy categorization algorithm: 20 benign experiments are first executed to obtain a baseline average and standard deviation. Since each strategy transfers the same 100MB file, a strategy is categorized as a function of its total transfer time and the baseline average and standard deviation transfer time.
Input: Strategy execution metrics
Output: The category of the strategy
function CategorizeAttackStrategy(s)
    if s.Time > (TimeAvg + 2 * TimeStddev) then
        return SLOWER
    else if s.Time < (TimeAvg - 2 * TimeStddev) then
        if s.SentData >= (0.7 * 100MB) then
            return FASTER
        else
            return STALLED
        end if
    else
        return BENIGN
    end if
end function
and the background flow is measured to investigate the impact on competing flows
during the attack, e.g., to determine if there was collateral damage.
5.1.2 Attack Strategies
We used TCPwn to execute the 30,297 generated attack strategies that manipulate
BBR's sending rate. After each attack strategy is executed, it is classified
into one of four categories: faster, slower, stalled (data transfer did not complete),
and benign (no attack was detected). These categories describe the effect on BBR's
sending rate. The attack categorization algorithm is the same as the one
used in [3] and is shown in Algorithm 3. At a high level, the categorization algorithm
compares BBR's average sending rate during the connection to the baseline average
and standard deviation (without an attack). If there is a large enough difference,
then the strategy is flagged as an attack.
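The categorization in Algorithm 3 can be sketched in Python as follows. The faster/stalled branch is read here as firing when the transfer time falls more than two standard deviations below the baseline (with a plus sign the benign branch would be unreachable); the 100MB file size and 0.7 completion threshold come from the algorithm.

```python
FILE_SIZE = 100 * 1024 * 1024  # 100 MB transfer used in every experiment

def categorize(time_s, sent_bytes, base_avg, base_std):
    """Classify one attack strategy against the benign baseline.

    Times far above the baseline are SLOWER; times far below it are
    FASTER if most of the file was actually sent, otherwise the
    transfer never finished and is STALLED; everything else is BENIGN.
    """
    if time_s > base_avg + 2 * base_std:
        return "SLOWER"
    if time_s < base_avg - 2 * base_std:
        if sent_bytes >= 0.7 * FILE_SIZE:
            return "FASTER"
        return "STALLED"
    return "BENIGN"
```

With a 10 s baseline and 1 s standard deviation, a 30 s transfer is SLOWER, a 5 s complete transfer is FASTER, and a 5 s transfer that moved almost no data is STALLED.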
5.1.3 Attack Analysis
After all 30,297 attack strategies were executed, 8,859 were flagged as potential
attacks: 14 faster, 4,025 slower and 4,820 stalled. We initially focused on extracting
the strategies that were most effective at manipulating BBR's sending rate in each
category. We grouped attack strategies in each category (disregarding attack-specific
parameters) and sorted each group by average sending rate. This allowed us to quickly find
which attack strategies were most effective at impacting BBR's performance.
While this allowed us to understand which exact strategies were most effective,
the limitation of this method is that it does not reveal whether any subset of actions
is more effective than others. Take for instance the following attack strategy that
hypothetically affects sending rate performance:

[(StateA,Action1),(StateB,Action2),(StateC,Action3)]

While it is true that this attack strategy affects sending rate performance, the
above method does not indicate whether performing Action2 in StateB was necessary for
causing it. To find which actions were most effective, we generated all possible attack
action subset combinations for each category and sorted them by their occurrence in
the original attack strategies. This allowed us to see which attack actions
were most effective in each category.
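The subset analysis described above can be sketched with the standard library. The strategy and subset shapes are illustrative; the real strategies also carry attack-specific parameters, which are disregarded here as in the text.

```python
from itertools import chain, combinations
from collections import Counter

def action_subset_counts(strategies):
    """Count how often every non-empty subset of (state, action) pairs
    appears across a list of attack strategies, so the most common
    action combinations can be ranked."""
    counts = Counter()
    for strategy in strategies:
        # deduplicate and canonicalize the pairs in this strategy
        pairs = tuple(sorted(set(strategy)))
        subsets = chain.from_iterable(
            combinations(pairs, r) for r in range(1, len(pairs) + 1))
        counts.update(subsets)
    return counts
```

Calling counts.most_common() then surfaces the action subsets shared by the largest number of flagged strategies in a category.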
For attack strategies causing faster send rates, the optimistic acknowledgment at-
tack appeared in 100% of the strategies. For attack strategies causing slower send
rates, executing the delayed acknowledgment attack in ProbeBW appeared in 53% of
the strategies, while attacks causing RTOs and quickly acknowledging data in expo-
nential backoff appeared in 47% of the strategies. As for the attack strategies causing
a stalled connection, attacks causing RTOs and preventing data from being acknowl-
edged in exponential backoff appeared in 89% of the strategies. Finally, attacks that
optimistically acknowledged lost data appeared in 11% of the strategies.
Figure 5.2.: This CDF represents the distribution of the average sending rate for each attack class executed 100 times. The victim BBR flow shares the network with an identical benign background CUBIC flow.
Time (ms)   Mbit/sec
0           6.0
80          7.5
160         9.4
240         11.7
320         14.6
400         18.3
480         22.8
560         28.6
640         35.7
720         44.7
800         55.9
880         69.8
960         87.3
1040        109.1
Table 5.2.: The optimistic acknowledgment attack causes BBR to increase its sending rate by 25% every 8 round-trips. In this example, the attack effectively cuts the perceived RTT of 20 ms in half.
5.2 Discovered Attacks on BBR
In this section, we describe each discovered attack that affects BBR's
sending rate. Each attack is most effective from the point-of-view of an on-path
Figure 5.3.: This timeline exemplifies the optimistic acknowledgments attack. The attacker keeps track of the highest data sequence sent by the sender and modifies acknowledgments from the receiver such that they acknowledge the highest sent sequence. If a duplicate acknowledgment (after modification) would be sent, then it is dropped.
attacker who has read/write access to the victim flow. Some attacks can also be
achieved by an off-path attacker who has write-only access to a flow. The greatest
limitation of off-path attackers is that they must learn the sender's sequence
number state.
5.2.1 Attack 1 – Optimistic Acknowledgments
A faster sending rate is caused by optimistically acknowledging only new data
sequences sent by the sender, without sending duplicates, effectively causing BBR
to overestimate the bottleneck bandwidth. Figure 5.3 shows how this attack mod-
ifies acknowledgments from the receiver to mislead BBR about network conditions.
The attacker records the highest observed data sequence number sent by the sender
and modifies acknowledgment numbers from the receiver such that they acknowledge
the highest sequence. If the modified acknowledgment would produce a duplicate ac-
knowledgment, then it is dropped. This results in packet loss (normally indicated by
duplicate acknowledgments) being hidden. An off-path attacker who is able to predict
the sender's sequence number and the receiver's acknowledgment rate would be able to
achieve the same effect on a victim flow by maliciously injecting acknowledgments
such that they acknowledge new data sent by the sender.
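The rewrite logic from Figure 5.3 can be sketched as a small stateful filter. The class and method names are illustrative, and details a real on-path attacker must handle (checksum recomputation, TCP options) are omitted.

```python
class OptimisticAcker:
    """Sketch of the optimistic acknowledgment rewrite: track the
    highest data sequence seen from the sender, rewrite each receiver
    ACK to cover it, and drop rewritten ACKs that would be duplicates
    (hiding any loss signal from the sender)."""

    def __init__(self):
        self.highest_seq = 0
        self.last_ack_sent = None

    def on_data(self, seq_end):
        # record the highest byte the sender has put on the wire
        self.highest_seq = max(self.highest_seq, seq_end)

    def on_ack(self, ack):
        # rewrite the ACK to acknowledge everything the sender has sent
        new_ack = max(ack, self.highest_seq)
        if new_ack == self.last_ack_sent:
            return None  # would be a duplicate ACK after rewrite: drop it
        self.last_ack_sent = new_ack
        return new_ack
```

Because duplicates are dropped rather than forwarded, the sender never sees the duplicate-ACK pattern that would normally reveal the loss.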
A faster sending rate results from how BBR aggressively probes for additional
bandwidth by sending 25% faster than BtlBw for 1 out of every 8 RTTs. This attack causes the
acknowledgment rate to reflect the increased sending rate. This is reflected in the
delivery rate samples (see Algorithm 1), causing the estimated bottleneck bandwidth
to increase by 25% as well. BBR maintains the increased sending rate for the next 8
round-trips until it sends 25% faster again in the next probing phase. Surprisingly,
we discovered that this attack alone does not cause the sending rate to increase. Even
though this attack causes the acknowledgment rate to increase (due to a shortened
ack_elapsed), delivery rate samples are capped by the sending rate. It is not until
BBR probes for bandwidth that this attack becomes effective, meaning the increase is self-
induced. This attack also halves RTprop because data is acknowledged sooner, causing
BBR to cycle through the 8 gain phases twice as fast. In our testing environment, this
attack caused BBR to increase its sending rate from 6 Mbit/sec [15] to 800 Mbit/sec
in less than 2 seconds! Table 5.2 shows how BBR's sending rate grows exponentially
in our testing environment.
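The growth in Table 5.2 follows from the two effects just described: each probe locks in a 25% overestimate of BtlBw, and the halved perceived RTT of 10 ms means a probing cycle of 8 round-trips completes every 80 ms. A sketch that reproduces the table's values:

```python
def bbr_rate_under_optack(t_ms, base_mbit=6.0, rtt_ms=10.0, gain=1.25):
    """Estimated sending rate t_ms after the attack starts, assuming
    the rate compounds by `gain` once per 8 perceived round-trips.
    A simplified model of Table 5.2, not BBR code."""
    probes = int(t_ms // (8 * rtt_ms))   # completed probing cycles
    return base_mbit * gain ** probes
```

At 1040 ms this gives 6 * 1.25^13, about 109 Mbit/sec, matching the last row of Table 5.2 and already exceeding the 100 Mbit/sec bottleneck.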
We use time/sequence graphs (TSGs) to understand why these attacks affect
BBR's sending rate on a per-ACK basis. In each TSG, the x-axis represents time
and the y-axis represents sent/acknowledged bytes from the BBR sender's (victim's)
point-of-view. We also use TSGs to understand how BBR flows targeted by these
attacks compare to benign flows (flows without attacks taking place) and how they
affect background flows that share the network concurrently. Last, Figure 5.2 shows
the impact of executing each attack 100 times, using a CDF of the average send rate
that the victim flow achieved during each execution. Note that most curves are nearly
vertical, indicating that each attack had a high probability of affecting the sending
rate.
Figure 5.4 illustrates how the optimistic acknowledgement attack grows the send-
ing rate quickly at the start of a connection as BBR probes for bandwidth. In
Figure 5.5, the background flow is unable to transmit any data while the victim flow
is sending at 800 Mbit/sec, leaving no available bandwidth for competing flows.
Figure 5.4.: This time-sequence graph illustrates how the optimistic acknowledgments attack causes the ACK rate to equal BBR's increased sending rate. This misleads BBR into believing the network can successfully sustain the increased sending rate.
Figure 5.5.: This time-sequence graph aims to understand how the victim flow (blue) compares to the benign (teal) and background (orange) flows, all sending a 100MB file. This figure also aims to understand how this attack affects a background flow being executed concurrently and sharing the same network with the victim flow. Due to its increasing sending rate, the victim flow completes quickly, in under 2 seconds. Because the victim flow obtained nearly all of the network resources, the background flow was starved of any bandwidth. The cluster of red x-markers on the background flow (during the victim's transmission) represents data being re-transmitted because it was dropped by the network. Once the victim flow completes, the background flow is able to successfully complete.
Figure 5.6.: This timeline exemplifies the delayed acknowledgments attack. The attacker delays each acknowledgment from reaching the sender for a fixed amount of time.
5.2.2 Attack 2 – Delayed Acknowledgments
A slower sending rate is caused by delaying acknowledgment packets from reaching
the sender for a fixed amount of time, causing the bottleneck bandwidth to be un-
derestimated. Interestingly, this attack caused BBR's sending rate to grow inversely
to the optimistic acknowledgments attack. Figure 5.9 shows how BBR's sending rate
grows in O(1/(n · ln(delay))) = O(1/n) time (the derivative of log_delay n). In general, longer delays
(that do not cause the re-transmission timer to expire) caused BBR to decrease its
sending rate more quickly. The amount of data sent over time is not to be confused with
the rate of change of its sending rate, hence the derivative.
This attack causes BBR’s sending rate is sequentially decrease over time because
when this attack first starts, the sender experiences an initial delay in acknowledg-
ments (equal to the however long ACKs are delayed for) after which acknowledgments
arrive at their natural rate. This causes the sender to stop sending new data until
the ACKs after the delay arrive. Since the attacker stops sending data, this causes
the sender to experience another delay in acknowledgment packets after one RTT.
This creates a pattern where the sender experiences delays in acknowledgments ev-
43
ery RTT, meaning regardless when delivery rate samples are taken, the plateaus in
acknowledgments cause the delivery rate samples to always be less than the current
BtlBw. This implies BBR will never increase its sending rate because as older BtlBw
estimates expire after 10 RTTs, it is replaced with the decreased delivery rate samples
due to this attack.
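This decay can be captured in a simple model: because every round trip now contains a plateau equal to the injected delay, each delivery rate sample spreads one round trip's worth of data over rtt + delay instead of rtt, and so always falls below the true BtlBw. The function below is an illustration with hypothetical names, not BBR's actual sampling code.

```python
def sample_mbit(btlbw_mbit, rtt_ms, delay_ms):
    """Delivery rate the sender measures over one round trip when each
    round trip contains a `delay_ms` plateau with no acknowledgments:
    the round trip's data (btlbw * rtt) now appears spread over
    rtt + delay."""
    return btlbw_mbit * rtt_ms / (rtt_ms + delay_ms)
```

Longer delays shrink the samples further, consistent with the observation above that longer delays cause the sending rate to decay more quickly.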
Surprisingly, BBR’s probing phase does not mitigate the e↵ect this attack has
on the sending rate. One would think that when BBR probes for bandwidth, the
delivery rate samples computed during probing would be large enough to surpass the
decreased delivery rate samples. This would be true, however because BBR drains
queues immediately following probing, the delivery rate samples take during probing
are still less than the current BtlBw. If BBR sent at a steady pace for at least 1 RTT
in between probing and draining, then this attack would not result in its sending rate
to decay.
Figure 5.9 shows how this attack is less effective against congestion control that uses
AIMD, such as TCP New Reno. Although this attack is still effective against New Reno,
its sending rate maintains a constant decreased rate rather than decaying.
It should be noted that this attack does not require TCP header information to be
modified to be effective, as packets are only delayed. This implies that QUIC [2], Google's
experimental transport layer protocol that uses encrypted headers, can be targeted
by this attack when using BBR.
Figure 5.7.: This time-sequence graph illustrates how the delayed acknowledgment attack causes BBR to sequentially decrease its sending rate every 10 RTTs. This phenomenon is due to how BBR retains the current BtlBw estimate and how this attack causes delivery rate samples to always be less than the current BtlBw.
Figure 5.8.: This time-sequence graph aims to understand how the victim flow (blue) compares to the benign (teal) and background (orange) flows, all sending a 100MB file. The victim flow can be seen transmitting data very slowly, while the background flow is able to obtain slightly greater bandwidth than in the benign case.
Figure 5.9.: To understand how delayed ACKs affect the sending rate of BBR and New Reno differently, this figure illustrates several time-sequence graphs, each with a distinct delay time. Each line represents the cumulative amount of data sent during the connection. Because of their different congestion control techniques, delayed acknowledgments cause BBR's sending rate to decay over time, whereas New Reno maintains a constant rate.
5.2.3 Attack 3 – Repeated RTO
A slower sending rate is caused by an attacker who allows small amounts of data
to be sent in between repeated re-transmission timeouts. This causes a slower send-
ing rate because, from when the attack first begins to when the RTO expires, the
Figure 5.10.: These timelines illustrate the four acknowledgment-based manipulation actions that cause the sender to experience a re-transmission timeout: dropping (top left), limiting (top right), stretching (bottom left) and delaying acknowledgments (bottom right).
sender does not send any new data, and this repeats throughout the duration of the
transmission.
This attack begins by causing the sender to experience a re-transmission timeout, which is
achieved by preventing new data from being acknowledged. There are four acknowledgment-
based manipulation actions that were found to cause this: dropping, limiting, stretch-
ing and delaying acknowledgments, all illustrated in Figure 5.10. Dropping acknowl-
edgments simply consists of preventing ACKs from being delivered to the sender.
By limiting acknowledgments, acknowledgment numbers are rewritten such that they equal
min(ack, limit). Stretching acknowledgments consists of forwarding only every nth
acknowledgment to the sender. Lastly, delaying acknowledgments (also used in at-
tack 2) consists of delaying acknowledgments from reaching the sender.
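The four manipulations can be sketched as transformations over the receiver's ACK stream. This is a toy model in which each ACK is reduced to a (time, ack_number) pair; real packets of course carry far more state.

```python
def drop_acks(acks):
    """Dropping: no ACKs are delivered to the sender at all."""
    return []

def limit_acks(acks, limit):
    """Limiting: rewrite each ACK number to min(ack, limit)."""
    return [(t, min(a, limit)) for t, a in acks]

def stretch_acks(acks, n):
    """Stretching: forward only every nth acknowledgment."""
    return [pair for i, pair in enumerate(acks) if (i + 1) % n == 0]

def delay_acks(acks, delay_s):
    """Delaying: shift every ACK's delivery time by a fixed amount."""
    return [(t + delay_s, a) for t, a in acks]
```

Each transformation starves the sender of acknowledgments for new data long enough for its re-transmission timer to expire.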
Figure 5.11.: This timeline illustrates one way the repeated re-transmission timeout attack can be achieved. In this timeline, re-transmission timeouts are achieved by dropping acknowledgments, which can also be achieved by limiting, stretching or delaying acknowledgments (see Figure 5.10). This attack repeatedly causes the sender to RTO, causing a slower sending rate.
After the re-transmission timeout is achieved through one of the above methods,
the sender enters exponential backoff, and the attacker optimistically acknowledges
some data and repeats the process. The purpose here is to quickly acknowledge the
lost segment in order to cause BBR to enter Startup. When BBR transitions into
Startup, its network path model (BtlBw and RTprop) is discarded, meaning the model
must be rediscovered after each RTO. This attack prevents BBR from obtaining an
opportunity to send data anywhere near the optimal operating point, resulting in
decreased throughput. Instead, data is sent in bursts with lengthy idling in between.
Figure 5.12 illustrates these bursts and how the connection idles until the timeout.
At 0 ms, BBR is in Startup, which is when the attacker begins to drop, limit, stretch
or delay ACKs. As a result, the sender stops sending data because in_flight has
reached cwnd. When the timeout occurs around 800 ms, the attacker optimistically
acknowledges the lost segment, causing Startup to be reentered, after which the attack
repeats. In Figure 5.13, the victim flow can be seen experiencing timeouts throughout
the entire flow (indicated by the red x-markers), allowing the background flow to
obtain more bandwidth.
Interestingly, the work in [3] reported that a similar attack resulted in a faster sending
rate in several cases. They note how the idle periods in between timeouts are out-
weighed by repeatedly entering slow start (where cwnd doubles on each ACK). This
was not the case in our work, however, most likely due to how we induce timeouts
the very instant BBR enters Startup, leaving less time for the sending rate to
double.
Figure 5.12.: This time-sequence graph illustrates the repeated re-transmission timeout attack taking place on a victim BBR flow. The gaps between data being sent represent the time between when the attacker begins to limit or stretch ACKs and when the sender's re-transmission timer expires. During this time, the sender sends no new data, leaving it in an idle state. This repeats during the entire transmission, causing the slowed sending rate.
Figure 5.13.: This time-sequence graph aims to understand how the victim flow (blue) compares to the benign (teal) and background (orange) flows, all sending a 100MB file. The sender can be seen re-transmitting segments throughout the entire transmission (indicated by the red x-markers) due to this attack. The background flow is able to obtain slightly more bandwidth made available by the slowed victim flow.
Figure 5.14.: This timeline illustrates one way the re-transmission timeout stall attack can be achieved. This attack works in two stages: cause the sender to RTO, then prevent the lost segment from being acknowledged. In this timeline, the re-transmission timeout is achieved by dropping acknowledgments, which can also be achieved by limiting, stretching or delaying acknowledgments (see Figure 5.10). Afterwards, the attacker prevents the lost segment from being acknowledged by dropping acknowledgments, which can also be achieved by limiting acknowledgments.
5.2.4 Attack 4 – Re-transmission Timeout Stall
In this attack, a stalled connection is caused by an attacker who causes BBR to
enter exponential backoff and prevents it from ever exiting. The attack begins by
causing the sender to timeout by performing any of the actions found in Figure 5.10:
dropping, limiting, stretching or delaying acknowledgments. After the timeout,
when exponential backoff is entered, the attacker prevents any new data from
being acknowledged, which can be accomplished by limiting or dropping acknowledg-
ments. This causes BBR to permanently remain in exponential backoff because the
re-transmitted segment will never be acknowledged. Additionally, no new data will
ever be sent because in_flight has reached cwnd, effectively stalling the connection. The
lost segment will be re-transmitted 15 times (the Linux default) with a doubled timeout
time in between each re-transmit. On the 16th re-transmission, the TCP connection
is torn down by the sender (at least 15 minutes, 24 seconds after the first
re-transmission).
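The teardown deadline can be checked by summing the doubling RTO series. The 200 ms initial RTO and the 120 s cap (TCP_RTO_MAX) are assumptions about typical Linux settings, not values measured in our testbed.

```python
def total_stall_time(initial_rto=0.2, retries=15, rto_max=120.0):
    """Seconds from the first timeout until the sender gives up,
    assuming the RTO doubles after each retransmission and is capped
    at rto_max (a sketch of Linux's exponential backoff, not kernel
    code)."""
    total, rto = 0.0, initial_rto
    for _ in range(retries + 1):  # initial timeout + 15 retransmissions
        total += rto
        rto = min(rto * 2, rto_max)
    return total
```

This yields roughly 924.6 seconds, about 15 minutes and 24.6 seconds, consistent with the bound stated above.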
In Figure 5.15, the connection stalls around 700 ms after only sending about 2.5
MB worth of data. In Figure 5.16, the background flow is able to obtain greater
bandwidth made available by the victim stalling. It is important to note that this
attack is highly flexible as it can be applied at any time during a connection and is
not limited to a specific state or time.
Figure 5.15.: This time-sequence graph illustrates the re-transmission stall attack taking place on a victim BBR flow. The ACKs plateau when the attack first starts, so new data is never acknowledged and the sender stalls.
Figure 5.16.: This time-sequence graph aims to understand how the victim flow (blue) compares to the benign (teal) and background (orange) flows, all sending a 100MB file. The victim flow stalls immediately when the attacker begins to limit or drop ACKs and keeps re-transmitting the first unacknowledged segment, doubling its timeout time in between each re-transmission, as indicated by the red x-markers.
Figure 5.17.: This timeline illustrates the sequence number desync stall attack. First, the attacker optimistically acknowledges a lost packet that the receiver did not receive. The sender then sends the next packet, which is out-of-order from the receiver's point-of-view, causing the receiver to send duplicate acknowledgments. To prevent the sender from re-transmitting the lost packet, the attacker injects acknowledgments so the sender never receives three duplicate acknowledgments in a row.
5.2.5 Attack 5 – Sequence Number Desync
In this attack, a stalled connection is caused by an attacker who optimistically
acknowledges lost data, causing sequence numbers between the sender and receiver to
de-synchronize. This attack works simply by acknowledging a lost segment (one that was
not actually delivered to the receiver). The primary reason this attack causes
a stalled connection is that the sender is unable to re-transmit the lost segment
because it was removed from the “re-transmission queue”. The TCP write queue
retains segments until they have been acknowledged. As segments are acknowledged,
they are discarded from the queue in order to free memory. By optimistically ac-
knowledging a lost segment, the attacker causes the sender to call tcp_clean_rtx_queue(), which discards
the lost segment from the write queue because the sender believes it has no need to
retain it any longer.
The sender then proceeds to send the next data segments, which will be delivered
out-of-order from the receiver's point of view, meaning the receiver will respond by
acknowledging the highest correct data segment received so far. Since the out-of-order
segment the sender sends cannot be acknowledged, it will keep being re-transmitted,
eventually causing three duplicate acknowledgments to be sent by the receiver. When
this occurs, the sender will try to re-transmit the lost segment but cannot, because it
has been removed from the re-transmission queue.
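The failure mode can be captured with a toy model of the sender's write queue: segments are retained only until they are cumulatively acknowledged, so an optimistic ACK for a lost segment makes it unrecoverable. This is purely illustrative; the real bookkeeping lives in the kernel's socket write queue.

```python
class WriteQueue:
    """Toy model of the sender's retransmission (write) queue."""

    def __init__(self):
        self.segments = {}          # seq -> payload

    def send(self, seq, payload):
        self.segments[seq] = payload

    def ack(self, ack_no):
        # cumulative ACK: free every segment at or below the ACK number
        for seq in [s for s in self.segments if s <= ack_no]:
            del self.segments[seq]

    def retransmit(self, seq):
        # once freed, a lost segment's data can never be resupplied
        return self.segments.get(seq)
```

After the attacker's optimistic ACK frees the lost segment, a later retransmit request for it returns nothing, mirroring the stalled connection described above.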
In Figure 5.18, the connection stalls because while the receiver sends duplicate
acknowledgments (around the 0.2 MB mark), the sender keeps re-transmitting the
same segment (around the 0.4 MB mark) because its re-transmission timer keeps
expiring. Since the receiver cannot accept that segment because it is out-of-order, it
cannot acknowledge it, causing a stalled connection. In Figure 5.19, the background
flow can be seen slightly increasing its sending rate because when the victim's connection
stalled, more bandwidth was made available, allowing the background flow to send
faster. This attack was discovered in [3] for TCP New Reno, where it also resulted in a
stalled connection.

Although this attack is most effective from an on-path attacker, it could
be achieved by an off-path attacker who successfully learns the victim flow's sequence
number state. If an off-path attacker successfully crafts and injects an acknowledg-
ment packet acknowledging a lost segment, then a stalled connection would result.
Figure 5.18.: This time-sequence graph illustrates the sequence number desync attack taking place on a victim BBR flow. By acknowledging a lost segment, the sequence numbers between the sender and receiver become desynchronized, causing the sender to keep re-transmitting an out-of-order segment.
Figure 5.19.: This time-sequence graph aims to understand how the victim flow (blue) compares to the benign (teal) and background (orange) flows, all sending a 100MB file. The victim flow stalls because this attack causes the sender to keep re-transmitting an out-of-order segment that the receiver cannot acknowledge. The sender keeps re-transmitting the segment, doubling its timeout time between each re-transmission, which is illustrated by the red x-markers.
Figure 5.20.: This timeline illustrates the acknowledgment burst attack. The attacker accumulates n = 3 acknowledgments and forwards them to the sender in a single burst.
5.3 Ineffective Congestion Control Attacks Against BBR

In this section, we describe how some previously known attacks against conges-
tion control were ineffective against BBR. We describe how BBR is immune to the
acknowledgment burst, division and duplication attacks.
5.3.1 Acknowledgment Bursts
In this attack, the attacker accumulates n acknowledgment packets from the re-
ceiver before forwarding them to the sender in a single burst. In [3], this attack
caused New Reno to send data in bursts because TCP is ACK-clocked, meaning its
sending behavior closely mimics the acknowledgment behavior. In BBR, acknowledg-
ment bursts do not cause data to be sent in bursts because, even though the delivery
rate samples computed for the first n-1 ACKs in the burst are deflated, the deliv-
ery rate sample for the nth (last) ACK in the burst is no different than without an
attack. Figure 5.20 illustrates why this is the case. ACKs 1 and 2 arrive later than
Figure 5.21.: This timeline illustrates the acknowledgment division attack. The attacker acknowledges the sequence space in steps.
normal, meaning ack_elapsed is increased, which deflates their delivery rate sam-
ples. However, ACK 3 arrives normally, even with the attack taking place, meaning
its delivery rate sample is unchanged. Since the delivery rate samples for the first
n-1 ACKs are always less than that of the nth (last) ACK, the ACKs arrive at the same
time and larger delivery rate samples take precedence, this attack does not impact
BBR's BtlBw estimate.
5.3.2 Acknowledgment Division
In this attack, a single ACK acknowledging m bytes is divided into n valid ACKs,
each acknowledging roughly m/n bytes. In [6], this attack caused cwnd to grow n
times as fast because, for each ACK, cwnd increased by one segment. Depending
on how the attacker injected the divided ACKs, this attack had no effect on BBR's
sending rate, for different reasons. If the attacker sent the divided ACKs at the
same time as the valid ACK (the ACK being divided), then the attack would be
ineffective for the exact same reason that the acknowledgment burst attack did
not affect BBR's sending rate (Section 5.3.1). If the attacker sent the divided ACKs
at the same time as the previous ACK (the ACK before the one being divided),
Figure 5.22.: This timeline illustrates the duplicate acknowledgment attack. The attacker injects n = 3 duplicate acknowledgments into the victim stream.
then the ACK rate would be clamped by BBR's current sending rate. If the attacker
performed this during BBR's probing phase, then it would be identical to attack
1, where the sender's sending rate is increased. If the attacker evenly spaced each
divided ACK, then the ACK rate of the divided ACKs would be no different than
the ACK rate without the attack.
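The stepped acknowledgments of Figure 5.21 can be generated with a helper like the following (illustrative; real injected ACKs would also need valid checksums and timing):

```python
def divide_ack(ack_no, prev_ack, n):
    """Split one cumulative ACK into n valid ACKs covering the same
    sequence range in roughly equal steps (a sketch of the division
    attack)."""
    span = ack_no - prev_ack
    step = span // n
    acks = [prev_ack + step * i for i in range(1, n)]
    acks.append(ack_no)  # the last ACK always lands on the original number
    return acks
```

For the 1500-byte segment in Figure 5.21, divide_ack(1500, 0, 3) yields ACKs 500, 1000 and 1500.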
5.3.3 Duplicate Acknowledgments
In this attack, n duplicate acknowledgments are injected for every acknowledgment
packet from the receiver. In congestion control schemes such as CUBIC and New Reno
that use packet loss to detect congestion, when ≥ 3 duplicate acknowledgments were
injected, Fast Recovery was entered to re-transmit the lost segment. This caused a
decreased sending rate because, upon entering Fast Recovery, the congestion window
is halved, causing data to be sent at a slower rate. Although BBR is not loss-based,
it still includes a mechanism for dealing with packet loss (detected via duplicate
acknowledgments) by entering a Recovery state. The reason this attack is not effective
against BBR is that BBR does not back off from the network upon packet loss.
Instead, BBR sets cwnd to in_flight and re-transmits the lost segment until all
data outstanding when Recovery was entered is acknowledged.
6 Related Work
6.1 BBR Congestion Control Analysis
Prior effort has gone into analyzing the Linux TCP BBR congestion control v1.0
implementation. The work in [34] evaluates BBR's throughput, queue delay, packet
loss and fairness with varying RTT, flow count and bottleneck buffer sizes. This
work found that when multiple BBR flows share the bottleneck, BBR tended to
overestimate the bottleneck bandwidth, leading to high packet loss and latency. Ideally, in a network
with a bottleneck bandwidth br where each BBR flow i observes a maximum
delivery rate sample b̂r_i, all flows sharing the bottleneck shall observe maximum
delivery rate samples such that Σ_i b̂r_i = br. It was shown that Σ_i b̂r_i > br, explained
by BBR's aggressive probing and by how BBR retains the maximum delivery rate sample
for 10 RTTs. This work also showed that flows with larger RTTs tended to obtain
larger sending rates. This is due to how BBR's inflight cap is set to 2 · RTT_min · b̂r_i,
implying that flows with larger RTTs will allow larger amounts of data in flight than those
with smaller RTTs. Lastly, this work showed that when the bottleneck switch queue
buffer is smaller than the path's BDP, flows tended to experience "massive packet
loss" because the bottleneck buffer overflowed before in flight reached the BDP.
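The RTT unfairness follows directly from that inflight cap. A small numeric sketch, with assumed illustrative values:

```python
def inflight_cap_bytes(rtt_min_ms, rate_bytes_per_ms):
    # BBR caps data in flight at 2 * RTT_min * estimated bottleneck rate.
    return 2 * rtt_min_ms * rate_bytes_per_ms

# Two flows sharing one bottleneck, both estimating the same delivery
# rate of 1250 bytes/ms (10 Mbps), but with different minimum RTTs.
short = inflight_cap_bytes(10, 1250)    # 10 ms RTT flow:  25,000 bytes
long_ = inflight_cap_bytes(100, 1250)   # 100 ms RTT flow: 250,000 bytes

# The long-RTT flow may keep 10x more data in flight, so it dominates
# the bottleneck queue and obtains a larger share of the delivery rate.
assert long_ == 10 * short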
The effort in [35] also showed that flows with naturally longer RTTs often obtained
greater throughput than those with shorter RTTs, irrespective of the bottleneck
bandwidth. Their reasoning was similar to [34]: flows with longer RTTs place
larger amounts of data into the network, effectively dominating queue backlogs and
delivery rates. This work proposes BBQ, a solution to BBR's RTT unfairness that
detects excess queue buildup and prevents long-RTT flows from starving short-RTT
flows by limiting the excess data they may send into the network.
Similar to [34, 35], the work in [36] reported that BBR favors flows with naturally
longer RTTs. It was shown that BBR insufficiently drains queues while in ProbeRTT,
so its RTprop estimate is too high, and BBR's in flight cap is consequently larger than it
should be.
6.2 Congestion Control Attacks
Since TCP's security lies in random initial sequence numbers, many attacks have
long existed specifically targeting TCP congestion control. The work in [6] demonstrates
how a misbehaving receiver can undermine congestion control, making senders
send data at a faster pace without compromising reliability. It is shown how TCP
is susceptible to divided, duplicate and optimistic acknowledgment attacks. With
divided acknowledgments, the receiver splits single acknowledgment packets into n
acknowledgments, each acknowledging 1/n of the data the original packet would have,
causing cwnd to increase more quickly. With duplicate acknowledgments, the receiver
sends n duplicate acknowledgments instead of a single acknowledgment. This attack
caused the sender to enter fast recovery and to send data in large bursts. By injecting
optimistic acknowledgment packets, the receiver was able to cause cwnd to increase
more quickly by acknowledging data it had not yet received.
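The divided acknowledgment attack against loss-based TCP can be sketched by assuming the classic slow-start rule of one MSS of cwnd growth per ACK received; the code below illustrates that rule and is not an implementation from [6]:

```python
MSS = 1460  # bytes, a common maximum segment size

def slow_start_cwnd(acks_per_segment, segments):
    # In slow start, a loss-based TCP sender grows cwnd by one MSS for
    # every ACK received, regardless of how much data each ACK covers.
    cwnd = MSS
    for _ in range(segments * acks_per_segment):
        cwnd += MSS
    return cwnd

honest = slow_start_cwnd(1, segments=10)   # one ACK per segment
divided = slow_start_cwnd(4, segments=10)  # each ACK split into n = 4

assert honest == 11 * MSS
assert divided == 41 * MSS  # cwnd grows roughly 4x faster under attack
```

This per-ACK growth rule is exactly what BBR's rate-sample design removes, which is why the same attack does not translate to BBR (Section 5.3.2).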
Much work has gone into off-path attackers who have write-only access to a flow.
Sequence numbers can be predicted [7–9, 37–40] to inject malicious content into a
victim's connection. The work in [7] shows how sequence numbers can be leaked
to unprivileged, on-device malware to coordinate with an off-path attacker, yielding
connection hijacking in under one second. The work in [37] aims for better initial
sequence number generation to make it more difficult for off-path attackers to succeed.
6.3 Protocol Fuzzing
Program analysis by automatically generating inputs has long been used to test
for security, robustness and reliability. Instead of generating random inputs, the
work in [41] generates relevant tests tailored to all possible
source code paths. Similar approaches have been used for network protocol analysis.
MAX [4] discovers attacks in network protocols, but requires source code to be
annotated where vulnerabilities are likely to exist, entailing thorough manual analysis.
This motivated model-guided testing [3, 42, 43], where a protocol's state machine is used
to discover relevant attacks, an approach that has been applied to a variety of protocols. KiF [44],
SNOOZE [45] and SNAKE [46] all take model-guided approaches to discover relevant
and effective attack strategies in network protocols. SNAKE, though, operates based
on network traffic alone and does not require a specific operating system or source
code. The limitation of SNAKE is that it is unable to inject different attacks based
on the protocol's current state.
TCPwn [3] takes a model-guided approach to discovering acknowledgment-based
manipulation attacks in TCP congestion control implementations. Like
SNAKE, TCPwn does not require source code or a specific operating system. Unlike
SNAKE, TCPwn is able to inject different attacks based on the protocol's
state, which allows for increased manipulation capability and becomes especially
useful for complex protocols. As discussed in Section 4, TCPwn cannot be directly
applied to BBR.
7 Conclusion
TCP congestion control is the mechanism that controls the rate at which data
is sent into a network so as to achieve full network utilization and bandwidth fairness
and to prevent congestion collapse. BBR is a new congestion control algorithm that
reevaluates whether packet loss is the optimal signal for congestion. Instead, BBR
paces data delivery based on the path's bottleneck bandwidth and round-trip time.
In this thesis, we take a model-guided approach to discover relevant acknowledgment-based
manipulation attacks in the Linux TCP BBR v1.0 implementation. To accomplish
this, we created the first detailed BBR finite-state machine (the model) in the
literature in order to discover theoretical attacks targeting its sending rate. To emulate
a realistic attacker and to achieve our multi-staged attack strategies, we developed a
novel algorithm to infer the current BBR state in real time based on network traffic
alone, allowing acknowledgment manipulation actions to be executed at the correct
times to mount successful attacks. This approach generated 30,297 attack strategies,
of which 8,859 caused BBR to send data at erroneous rates with respect to actual
network conditions.
We identified 5 classes of attacks that caused BBR to send data at high, slow and
stalled rates. We discovered that, due to how BBR multiplicatively probes for bandwidth,
an on/off-path attacker who optimistically acknowledges data caused BBR to
increase its sending rate by 13x in under 1 second. We also discovered how the combination
of gain cycling and delayed acknowledgments by an on-path attacker caused
BBR to not only decrease, but sequentially decrease, its sending rate over time. We
showed how an on-path attacker caused BBR to decrease its sending rate by sequentially
preventing new data from being acknowledged, causing re-transmission
timeouts and forcing BBR to reset and rediscover the network path model each time. We
also showed how an on-path attacker who prevents new data from being acknowledged
caused BBR to permanently enter exponential backoff, effectively stalling data
transmission. We showed how an on/off-path attacker can optimistically acknowledge
lost data, causing sequence numbers to desynchronize and leading to a stalled connection.
Lastly, we showed how the bursted, divided and duplicated acknowledgment attacks
that were previously effective against prior congestion control schemes were not effective
against BBR because of its new congestion control design.
References
[1] University of Southern California Information Sciences Institute. Transmission Control Protocol. https://tools.ietf.org/html/rfc793, 1981.
[2] J. Iyengar and M. Thomas. QUIC: A UDP-Based Multiplexed and Secure Transport. https://tools.ietf.org/html/draft-ietf-quic-transport-18, 2019.
[3] Samuel Jero, Endadul Hoque, David Choffnes, Alan Mislove, and Cristina Nita-Rotaru. Automated Attack Discovery in TCP Congestion Control Using a Model-guided Approach. In Proc. of Network & Distributed System Security Symposium (NDSS), 2018.
[4] Nupur Kothari, Ratul Mahajan, Todd Millstein, Ramesh Govindan, and Madanlal Musuvathi. Finding Protocol Manipulation Attacks. In SIGCOMM, pages 26–37, 2011.
[5] Laurent Joncheray. A Simple Active Attack Against TCP. In USENIX Security Symposium, 1995.
[6] Stefan Savage, Neal Cardwell, David Wetherall, and Tom Anderson. TCP Congestion Control with a Misbehaving Receiver. ACM SIGCOMM Computer Communication Review, 29(5), 1999.
[7] Zhiyun Qian, Z. Morley Mao, and Yinglian Xie. Collaborative TCP Sequence Number Inference Attack: How to Crack Sequence Number Under a Second. In ACM Conference on Computer and Communications Security, 2012.
[8] Yue Cao, Zhiyun Qian, Zhongjie Wang, Tuan Dao, Srikanth V. Krishnamurthy, and Lisa M. Marvel. Off-path TCP Exploits: Global Rate Limit Considered Dangerous. In Proceedings of the 25th USENIX Conference on Security Symposium, SEC'16, pages 209–225, Berkeley, CA, USA, 2016. USENIX Association.
[9] Yossi Gilad and Amir Herzberg. Off-path Attacking the Web. In Proceedings of the 6th USENIX Conference on Offensive Technologies, WOOT'12, pages 5–5, Berkeley, CA, USA, 2012. USENIX Association.
[10] N. Cardwell, P. Jha, E. Dumazet, K. Yang, D. Miller, and Y. Seung. Linux TCP BBR. https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/net/ipv4/tcp_bbr.c, 2018.
[11] Neal Cardwell, Yuchung Cheng, C. Stephen Gunn, Soheil Hassas Yeganeh, andVan Jacobson. BBR: Congestion-Based Congestion Control. ACM Queue, 2016.
[12] Neal Cardwell, Yuchung Cheng, Soheil Hassas Yeganeh, and Van Jacobson. BBR Congestion Control. https://tools.ietf.org/id/draft-cardwell-iccrg-bbr-congestion-control-00.html, 2017.
[13] Neal Cardwell, Yuchung Cheng, Soheil Hassas Yeganeh, and Van Jacobson. Delivery Rate Estimation. https://tools.ietf.org/html/draft-cheng-iccrg-delivery-rate-estimation-00, 2018.
[14] Neal Cardwell, Yuchung Cheng, C. Stephen Gunn, Soheil Hassas Yeganeh, and Van Jacobson. BBR Congestion Control. https://www.ietf.org/proceedings/97/slides/slides-97-iccrg-bbr-congestion-control-02.pdf, November 2016.
[15] Neal Cardwell, Yuchung Cheng, C. Stephen Gunn, Soheil Hassas Yeganeh, and Van Jacobson. BBR Congestion Control: An Update. https://www.ietf.org/proceedings/98/slides/slides-98-iccrg-an-update-on-bbr-congestion-control-00.pdf, March 2017.
[16] Neal Cardwell, Yuchung Cheng, C. Stephen Gunn, Soheil Hassas Yeganeh, Ian Swett, Jana Iyengar, Victor Vasiliev, and Van Jacobson. BBR Congestion Control: IETF 99 Update. https://www.ietf.org/proceedings/99/slides/slides-99-iccrg-iccrg-presentation-2-00.pdf, July 2017.
[17] Neal Cardwell, Yuchung Cheng, C. Stephen Gunn, Soheil Hassas Yeganeh, Ian Swett, Jana Iyengar, Victor Vasiliev, and Van Jacobson. BBR Congestion Control: IETF 100 Update: BBR in Shallow Buffers. https://datatracker.ietf.org/meeting/100/materials/slides-100-iccrg-a-quick-bbr-update-bbr-in-shallow-buffers, November 2017.
[18] Neal Cardwell, Yuchung Cheng, C. Stephen Gunn, Soheil Hassas Yeganeh, Ian Swett, Jana Iyengar, Victor Vasiliev, Priyaranjan Jha, Yousuk Seung, and Van Jacobson. BBR Congestion Control Work at Google: IETF 101 Update. https://datatracker.ietf.org/meeting/101/materials/slides-101-iccrg-an-update-on-bbr-work-at-google-00, March 2018.
[19] Neal Cardwell, Yuchung Cheng, C. Stephen Gunn, Soheil Hassas Yeganeh, Ian Swett, Jana Iyengar, Victor Vasiliev, Priyaranjan Jha, Yousuk Seung, Kevin Yang, Matt Mathis, and Van Jacobson. BBR Congestion Control Work at Google: IETF 102 Update. https://datatracker.ietf.org/meeting/102/materials/slides-102-iccrg-an-update-on-bbr-work-at-google-00, July 2018.
[20] L. Kleinrock. Power and Deterministic Rules of Thumb for Probabilistic Problems in Computer Communications. January 1979.
[21] V. Jacobson and R. Braden. TCP Extensions for Long-Delay Paths. https://tools.ietf.org/html/rfc1072, October 1988.
[22] Van Jacobson. Congestion Avoidance and Control. ACM SIGCOMM ComputerCommunication Review, 18(4):314–329, 1988.
[23] Defense Advanced Research Projects Agency. Transmission Control Protocol.https://tools.ietf.org/html/rfc793, 1981.
[24] Jong Suk Ahn, Peter B. Danzig, Zhen Liu, and Limin Yan. Evaluation of TCP Vegas: Emulation and Experiment. SIGCOMM Comput. Commun. Rev., 25(4):185–195, October 1995.
[25] Lawrence S. Brakmo, Sean W. O'Malley, and Larry L. Peterson. TCP Vegas: New Techniques for Congestion Detection and Avoidance. In SIGCOMM, 1994.
[26] David X. Wei, Cheng Jin, Steven H. Low, and Sanjay Hegde. FAST TCP:Motivation, Architecture, Algorithms, Performance. IEEE/ACM Trans. Netw.,14(6):1246–1259, December 2006.
[27] R. Jain. A delay-based approach for congestion avoidance in interconnected het-erogeneous computer networks. SIGCOMM Comput. Commun. Rev., 19(5):56–71, October 1989.
[28] J. Mo, R. J. La, V. Anantharam, and J. Walrand. Analysis and comparison oftcp reno and vegas. volume 3, pages 1556–1563 vol.3, March 1999.
[29] Richard J. La, Jean Walrand, and Venkat Anantharam. Issues in TCP Vegas,2001.
[30] S. Floyd. HighSpeed TCP for Large Congestion Windows. https://tools.ietf.org/html/rfc3649, 2003.
[31] V. A. Kumar, P. S. Jayalekshmy, G. K. Patra, and R. P. Thangavelu. On remoteexploitation of TCP sender for low-rate flooding denial-of-service attack. IEEECommunications Letters, 13(1):46–48, 2009.
[32] Lixia Zhang, Scott Shenker, and David D. Clark. Observations on the Dynamics of a Congestion Control Algorithm: The Effects of Two-way Traffic. SIGCOMM Comput. Commun. Rev., 21(4):133–147, August 1991.
[33] Amit Aggarwal, Stefan Savage, and Thomas Anderson. Understanding the Performance of TCP Pacing. In Proceedings of IEEE INFOCOM, 2000.
[34] Mario Hock, Roland Bless, and Martina Zitterbart. Experimental evaluationof BBR congestion control. In 2017 IEEE 25th International Conference onNetwork Protocols (ICNP), pages 1–10. IEEE, 2017.
[35] Shiyao Ma, Jingjie Jiang, Wei Wang, and Bo Li. Fairness of Congestion-BasedCongestion Control: Experimental Evaluation and Analysis, 2017.
[36] Dominik Scholz, Benedikt Jager, Lukas Schwaighofer, Daniel Raumer, Fabien Geyer, and Georg Carle. Towards a Deeper Understanding of TCP BBR Congestion Control. In IFIP Networking, 2018.
[37] S. Bellovin. Defending Against Sequence Number Attacks. https://tools.ietf.org/html/rfc1948, 1996.
[38] Robert T. Morris. A Weakness in the 4.2BSD Unix TCP/IP Software, 1985.
[39] Z. Qian and Z. M. Mao. Off-path TCP Sequence Number Inference Attack - How Firewall Middleboxes Reduce Security. In 2012 IEEE Symposium on Security and Privacy, pages 347–361, May 2012.
[40] Weiteng Chen and Zhiyun Qian. Off-Path TCP Exploit: How Wireless Routers Can Jeopardize Your Secrets. In 27th USENIX Security Symposium (USENIX Security 18), pages 1581–1598, Baltimore, MD, 2018. USENIX Association.
[41] Joe W. Duran and Simeon Ntafos. A report on random testing. In Proceedingsof the 5th International Conference on Software Engineering, ICSE ’81, pages179–183, Piscataway, NJ, USA, 1981. IEEE Press.
[42] C. Cho, D. Babic, P. Poosankam, K. Chen, E. Wu, and D. Song. MACE: Model-inference-Assisted Concolic Exploration for Protocol and Vulnerability Discovery.In USENIX Conference on Security, 2011.
[43] Joeri de Ruiter and Erik Poll. Protocol State Fuzzing of TLS Implementations.In 24th USENIX Security Symposium (USENIX Security 15), pages 193–206,Washington, D.C., 2015. USENIX Association.
[44] Humberto J. Abdelnur, Radu State, and Olivier Festor. KiF: A Stateful SIP Fuzzer. In Proceedings of the 1st International Conference on Principles, Systems and Applications of IP Telecommunications, pages 47–56, New York, NY, USA, 2007. ACM.
[45] G. Banks, M. Cova, V. Felmetsger, K. Almeroth, R. Kemmerer, and G. Vigna.SNOOZE: Toward a stateful network protocol fuzzer. In International Confer-ence on Information Security, pages 343–358. 2006.
[46] S. Jero, H. Lee, and C. Nita-Rotaru. Leveraging State Information for Automated Attack Discovery in Transport Protocol Implementations. In IEEE/IFIP International Conference on Dependable Systems and Networks, 2015.
[47] Centre for the Protection of National Infrastructure. Security Assessment of the Transmission Control Protocol. Technical Report CPNI Technical Note 3/2009, Centre for the Protection of National Infrastructure, 2009.
[48] Miguel Rodríguez-Pérez, Sergio Herrería-Alonso, Manuel Fernández-Veiga, and Cándido López-García. Common Problems in Delay-Based Congestion Control Algorithms: A Gallery of Solutions. CoRR, abs/1405.5365, 2014.
[49] Aleksandar Kuzmanovic and Edward Knightly. Low-rate TCP-targeted denial ofservice attacks and counter strategies. IEEE/ACM Transactions on Networking,14(4):683–696, 2006.
[50] Raz Abramov and Amir Herzberg. TCP ack storm DoS attacks. In IFIP Inter-national Information Security Conference, pages 29–40, 2011.
[51] W. Stevens. TCP Slow Start, Congestion Avoidance, Fast Retransmit, and Fast Recovery Algorithms. https://tools.ietf.org/html/rfc2001, 1997.
[52] E. Rescorla. HTTP over TLS. https://tools.ietf.org/html/rfc2818, May 2000.
[53] S. Floyd. Congestion Control Principles. https://tools.ietf.org/html/rfc2914, 2000.
[54] T. Henderson, S. Floyd, A. Gurtov, and Y. Nishida. The NewReno Modification to TCP's Fast Recovery Algorithm. https://tools.ietf.org/html/rfc6582, April 2012.
[55] S. Shalunov, G. Hazel, J. Iyengar, and M. Kuehlewind. Low Extra Delay Background Transport (LEDBAT). https://tools.ietf.org/html/rfc6817, December 2012.
[56] International Business Machines Corporation. FASP. https://asperasoft.com/technology/transport/fasp/, 2019.
[57] J. H. Saltzer, D. P. Reed, and D. D. Clark. End-To-End Arguments in System Design. 1981.
[58] J. Song, C. Cadar, and P. Pietzuch. SymbexNet: Testing Network Protocol Implementations with Symbolic Execution and Rule-Based Specifications. IEEE Transactions on Software Engineering, 40(7):695–709, 2014.
[59] Sangtae Ha, Injong Rhee, and Lisong Xu. CUBIC: A New TCP-friendly High-speed TCP Variant. SIGOPS Oper. Syst. Rev., 42(5):64–74, 2008.
[60] S. M. Bellovin. Security Problems in the TCP/IP Protocol Suite. SIGCOMMComput. Commun. Rev., 19(2):32–48, April 1989.
[61] Samuel C. Jero. Analysis and Automated Discovery of Attacks in TransportProtocols, 2018.
[62] Samuel C. Jero. TCPwn: A system for automated discovery of congestioncontrol attacks on TCP implementations, 2018.
[63] U. Hengartner, J. Bolliger, and T. Gross. TCP Vegas Revisited. In Proceedings of IEEE INFOCOM 2000, Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies, 2000.
[64] Jim Gettys and Kathleen Nichols. Bufferbloat: Dark Buffers in the Internet. Queue, 9(11):40:40–40:54, November 2011.
[65] Claudio Casetti, Mario Gerla, Saverio Mascolo, M. Y. Sanadidi, and RenWang. Tcp westwood: End-to-end congestion control for wired/wireless net-works. Wirel. Netw., 8(5):467–479, September 2002.
[66] Kun Tan, Jingmin Song, Qian Zhang, and Murari Sridharan. A Compound TCPApproach for High-speed and Long Distance Networks. page 12, July 2005.
[67] Wei Sun, Lisong Xu, Sebastian Elbaum, and Di Zhao. Model-Agnostic and E�-cient Exploration of Numerical State Space of Real-World TCP Congestion Con-trol Implementations. In 16th USENIX Symposium on Networked Systems De-sign and Implementation (NSDI 19), pages 719–734, Boston, MA, 2019. USENIXAssociation.