History and Internals of TCP/IP
Andrew Tucker
February 15, 2000
What We’ll Cover
Big picture of network protocols
Where TCP/IP lives in the network layer model
Protocols that utilize TCP/IP
Under the hood of IP
• Addressing and Routing
Under the hood of TCP (and UDP)
• Ensuring reliable delivery
Weaknesses of TCP/IP
Resources for more info
What We’ll Cover
All topics should be considered overviews
References for more depth on each subject will be given at the end
Programming with sockets will be covered in the next session
Feel free to interrupt with questions at any time
What is TCP/IP?
Set of protocols that are used for communication across a network
TCP/IP = Transmission Control Protocol / Internet Protocol
UDP = User Datagram Protocol
Standard method for transferring data and information on the Internet
What is a protocol?
Definition: A set of rules that regulate the way data is transmitted between computers.
There are infinitely many ways to realize this abstract notion - so why did the Internet standardize on TCP/IP?
Why TCP/IP?
‘cuz Uncle Sam said so!
Originally a set of conventions developed by the DOD and DARPA in 1969, formalized into TCP/IP in the 1980s
Original ideas attributed to Vinton Cerf and Robert Kahn
Gained popularity in the user community because of inclusion in v4.2 of BSD UNIX
Why TCP/IP?
DARPA's network (the ARPANET) was the early precursor of the Internet
If you wanted to talk on the ARPANET you needed to speak TCP/IP
TCP/IP was designed well enough to scale to the Internet*
* - until recently...
Why TCP/IP?
Three Main Goals:
• Interoperability - communicate between heterogeneous hardware and OS
• Robustness - reliability and performance
• Ease of Reconfiguration - add and remove computers without disruption
ISO OSI 7-layer model
ISO developed the 7-layer Open Systems Interconnection (OSI) model independent of TCP/IP in the 1970s
Allows each layer of a protocol to be changed without affecting layers above or below
ISO OSI 7-layer model
Layer 7: Application - interfaces with end user
Layer 6: Presentation - data format conversion
Layer 5: Session - establishes node connection
Layer 4: Transport - ensures delivery and correctness
Layer 3: Network - routing and addressing
Layer 2: Data Link - interface for physical line (NIC)
Layer 1: Physical - actual transmission line or “bit pipe”
Modified Conceptual 5 Layer Model
The top three layers of the ISO OSI model don’t relate well to Internet protocols using TCP/IP
Conceptually it helps to think about a 5 layer model for the Internet and TCP/IP
OSI 7-layer model: Application, Presentation, Session, Transport, Network, Data Link, Physical
Modified 5 Layer Conceptual Model
Application
Transport
Network
Data Link
Physical
TCP/IP In the 5 Layer Model
TCP handles the transport layer and guarantees data delivery and correctness
UDP is a transport-layer alternative to TCP that doesn’t guarantee delivery
IP lives in the network layer and handles routing and addressing
TCP/IP In the 5 Layer Model
Application
Transport: TCP, UDP
Network: IP, ICMP, IGMP
Data Link: LLC, MAC
Physical: Ethernet, Token Ring, PPP
Sockets API: Stream Connection, Connectionless Datagram
IP Internals
Current version in widespread use is IPv4
Each node in an internet has a 32-bit IP address such as 10.0.3.172
IP knows nothing of text names like www.bsquare.com - they are translated to the numeric form by DNS
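To make the "32-bit address" point concrete, here is a minimal sketch using Python's standard `ipaddress` module. It only shows that the dotted form and a single 32-bit integer are two spellings of the same value; the variable names are illustrative.

```python
import ipaddress

# The dotted form is just a readable spelling of one 32-bit number.
addr = ipaddress.IPv4Address("10.0.3.172")
as_int = int(addr)                   # 10*2**24 + 0*2**16 + 3*2**8 + 172
as_bits = format(as_int, "032b")     # the same value as 32 binary digits

print(as_int)                          # 167773100
print(as_bits)
print(ipaddress.IPv4Address(as_int))   # back to 10.0.3.172
```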
IP Internals
IP addresses are split into two parts:
• network - same for all hosts on the same network
• host - identifies a specific host within a network
The number of bits that represent the network and host varies by the address “class”
IP Internals
Class   Leading bits   Network bits   Host bits
A       0              7              24
B       1 0            14             16
C       1 1 0          21             8
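The class layout can be sketched as a small function that inspects the leading bits and then splits the remaining bits into network and host numbers. This is an illustration of the bit layout only; `split_classful` and `CLASSES` are hypothetical names, and the sketch covers just classes A, B, and C.

```python
import ipaddress

# (class name, leading-bit pattern, length of that pattern, host bits)
CLASSES = [("A", 0b0, 1, 24), ("B", 0b10, 2, 16), ("C", 0b110, 3, 8)]

def split_classful(dotted):
    """Return (class, network number, host number) for a class A/B/C address."""
    value = int(ipaddress.IPv4Address(dotted))
    for name, pattern, pattern_len, host_bits in CLASSES:
        if value >> (32 - pattern_len) == pattern:
            network_bits = 32 - pattern_len - host_bits
            network = (value >> host_bits) & ((1 << network_bits) - 1)
            host = value & ((1 << host_bits) - 1)
            return name, network, host
    raise ValueError("not a class A, B, or C address")

print(split_classful("10.0.3.172"))      # class A: 7-bit network, 24-bit host
print(split_classful("206.190.139.3"))   # class C: 21-bit network, 8-bit host
```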
IP Internals
Original idea was to have a small number of WANs (class A), a modest number of campus size networks (class B) and a large number of LANs (class C)
Explosion of the Internet has changed this - many clever interpretations of IP addresses have been invented to stretch the limit
IP Internals
IP routes information across a network via “packet switching” (as opposed to circuit switching)
Each packet is transmitted as a separate entity
Different packets can take different routes and can arrive in a different order than they were sent
IP Internals
Packets are sent as datagrams, so delivery isn’t guaranteed
Each packet has an IP header that contains source and destination address, data and header length, etc.
Packets are routed based on the network specified in the destination address
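As a rough illustration of what the IP header carries, the sketch below unpacks the fixed 20-byte part of an IPv4 header with the standard `struct` module. The sample header is hand-built for the example (its checksum is left as 0), and `parse_ipv4_header` is a hypothetical helper name.

```python
import socket
import struct

def parse_ipv4_header(raw):
    """Unpack the fixed 20-byte part of an IPv4 header into a dict."""
    (ver_ihl, _tos, total_len, _ident, _frag,
     ttl, proto, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    header_len = (ver_ihl & 0x0F) * 4       # IHL counts 32-bit words
    return {
        "version": ver_ihl >> 4,
        "header_len": header_len,
        "data_len": total_len - header_len,
        "ttl": ttl,
        "protocol": proto,                   # 6 = TCP, 17 = UDP
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# Hand-built sample header (checksum left as 0 for brevity).
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 64, 6, 0,
                     socket.inet_aton("10.0.3.172"),
                     socket.inet_aton("63.76.82.70"))
info = parse_ipv4_header(sample)
print(info)
```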
IP Internals
If the source and destination addresses are on the same network life is simple (e.g. Ethernet uses ARP to get the MAC address)
If the source and destination addresses are on different networks it is more complicated...
IP Internals
Special nodes called “gateways” connect networks
Gateways have tables that map network numbers to gateway addresses
Datagrams are forwarded to the gateway corresponding to their destination network number
What if there is no gateway available?
IP Internals
Default gateways are used if no mapping is present
Once a mapping is found the sender is notified of the correct gateway mapping (via an ICMP redirect)
Over time, senders build up a mapping table based on these ICMP notifications
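The table lookup with a default gateway fallback can be sketched as follows. Everything here is hypothetical: the gateway addresses are made up, and the `/8` and `/16` prefixes are simply the modern way of writing a class A and a class B network number.

```python
import ipaddress

# Hypothetical table: destination network -> gateway address.
ROUTES = {
    ipaddress.ip_network("10.0.0.0/8"): "10.0.0.1",
    ipaddress.ip_network("157.130.0.0/16"): "10.0.0.2",
}
DEFAULT_GATEWAY = "10.0.0.254"

def next_hop(destination):
    """Return the gateway for the destination's network, else the default."""
    dest = ipaddress.ip_address(destination)
    for network, gateway in ROUTES.items():
        if dest in network:
            return gateway
    return DEFAULT_GATEWAY

print(next_hop("157.130.177.121"))   # matches the second table entry
print(next_hop("63.76.82.70"))       # no mapping, so the default gateway
```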
IP Internals
A simple routing example via TraceRoute:
1 www.worldaccessnet.com (206.190.139.3)
2 worldaccessnet-2t1-ltipdxbackbone.ltinet.net (206.190.136.117)
3 pdx2lc.worldaccessnet.com (206.190.136.6)
4 seattle-portland-ds3.sea.above.net (209.133.31.50)
5 POS1-0-0.GW2.SEA4.ALTER.NET (157.130.177.121)
6 112.ATM3-0.XR2.SEA4.ALTER.NET (146.188.200.174)
7 292.ATM3-0.XR2.SEA1.ALTER.NET (146.188.200.157)
8 194.ATM9-0-0.GW1.SEA1.ALTER.NET (146.188.200.45)
9 63.76.82.94 (63.76.82.94)
10 www.bsquare.com (63.76.82.70)
IP Internals
TTL (Time To Live) field in IP header eliminates endless routing loops by limiting hop count
127.0.0.1 is a special loopback address
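A toy model of the TTL rule: every router decrements the field and discards the packet once it reaches zero, so even a routing loop ends after a bounded number of hops. The function name and the hop counts are assumptions made for the illustration.

```python
def forward(ttl, routers):
    """Send a packet through `routers` routers; return (delivered, routers_reached)."""
    for reached in range(1, routers + 1):
        ttl -= 1                     # every router decrements the TTL
        if ttl <= 0:
            return False, reached    # this router discards the packet
    return True, routers

print(forward(64, 9))        # enough TTL for a short path
print(forward(64, 10**6))    # a "loop" of a million routers still stops at hop 64
```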
UDP Internals
Ensures data correctness, but not reliable delivery
Adds a “port” number to IP
Think of ports as channels on a single machine - more on this in the discussion of sockets
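"Data correctness" here comes from a checksum. The sketch below implements the 16-bit ones' complement Internet checksum described in RFC 1071, which is the style of checksum UDP uses. It is a simplification: a real UDP checksum also covers the UDP header and a pseudo-header of IP addresses, which this example leaves out.

```python
def internet_checksum(data):
    """16-bit ones' complement of the ones' complement sum of `data`."""
    if len(data) % 2:
        data += b"\x00"                        # pad to a whole 16-bit word
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                         # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

payload = bytes.fromhex("0001f203f4f5f6f7")
check = internet_checksum(payload)
# A receiver recomputes over data plus checksum; 0 means no error detected.
verified = internet_checksum(payload + check.to_bytes(2, "big")) == 0
print(hex(check), verified)
```

A detected mismatch only lets the receiver discard the datagram; nothing in UDP asks for it to be sent again.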
UDP Internals
Sends entire chunk of data in one packet
Sends datagrams in one direction
TCP Internals
Lots of versions floating around:
• Tahoe - released with BSD NR 1.0
• Reno - released with BSD NR 2.0
• New TCP Reno
• TCP Vegas
Versions are guaranteed to interoperate but not with optimal performance
TCP Internals
Guarantees data correctness and delivery
Uses ports in the same way as UDP
Breaks data into individual packets
Full duplex two-way stream
Complete implementation is complicated with lots of intricate details - we’ll touch on interesting highlights
TCP Internals
Operates on two basic principles: flow control and congestion control
Flow control involves preventing senders from overrunning the capacity of receivers
Congestion control involves preventing too much data from being injected into the network, causing links and switches to become overloaded
TCP Internals
Follows a basic protocol design rule called “smart sender, dumb receiver”
Flow control done via “sliding window”
• For window size n, only n bytes can be sent without receiving an acknowledgement
• When data is acknowledged, the window slides forward
TCP Internals
TCP packet header advertises a window size indicating the number of bytes the receiver is willing to accept
Initial window size established in TCP connection setup
TCP Internals
Packet header includes the last byte acknowledged and the packet sequence number
Sequence numbers are used to reassemble packets in the order they were sent
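Reassembly by sequence number can be sketched in a few lines. The assumptions are loud ones: each sequence number is treated as the byte offset of the segment's data and numbering starts at 0, whereas a real connection starts from a negotiated initial sequence number. `reassemble` is a hypothetical helper.

```python
def reassemble(segments):
    """Order (sequence_number, data) pairs by sequence number and join the data."""
    stream = b""
    for seq, data in sorted(segments):
        if seq != len(stream):
            raise ValueError(f"missing data at byte {len(stream)}")
        stream += data
    return stream

# Segments that arrived out of order, as IP allows.
arrived = [(6, b"world"), (0, b"hello "), (11, b"!")]
print(reassemble(arrived))
```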
TCP Internals
[Figure: sliding window over bytes 1 to 12]
Bytes 1 2 3: sent and acknowledged
Bytes 4 5 6 7 8 9: offered window (advertised by receiver), made up of bytes that are sent, not ACKed, followed by the usable window (can send ASAP)
Bytes 10 11 12: can’t send until window moves
Left side of window advances when data is acknowledged
Right side controlled by size of window advertisement
TCP Internals
What if receiver’s buffer fills up and results in an advertised window size of 0?
TCP periodically sends a 1-byte “probe” packet; the byte is not accepted, but the reply carries a new advertised window size
EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
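The EffectiveWindow formula translates directly into code. In this sketch the clamp to zero is an added assumption (a sender cannot have a negative allowance), and the sample numbers are made up: an advertised window of 6 bytes with bytes 4 to 6 still unacknowledged.

```python
def effective_window(advertised, last_byte_sent, last_byte_acked):
    """Bytes the sender may still transmit before it must wait for an ACK."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, advertised - in_flight)

print(effective_window(6, 6, 3))   # 3 bytes in flight leaves room for 3 more
print(effective_window(0, 6, 6))   # a zero window: nothing may be sent
```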
TCP Internals
ACKs indicate the last consecutive packet received
Packets are retransmitted if an ACK is not received after a certain time period
Timeout value varies depending on the average round trip time (RTT) of previous packets
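The slides do not give the timeout formula. One common form, following the smoothing described in the original TCP specification (RFC 793), is an exponentially weighted moving average of RTT samples multiplied by a safety factor. The values `alpha=0.875` and `factor=2.0` below are assumptions within the ranges that specification suggests, not figures from this talk.

```python
def update_timeout(estimated_rtt, sample_rtt, alpha=0.875, factor=2.0):
    """Blend a new RTT sample into the running average; derive the timeout."""
    estimated_rtt = alpha * estimated_rtt + (1 - alpha) * sample_rtt
    return estimated_rtt, factor * estimated_rtt

# Running estimate of 100 ms, then one slow 200 ms sample arrives.
print(update_timeout(100.0, 200.0))   # (112.5, 225.0)
```

A larger `alpha` makes the estimate steadier but slower to react to a real change in path delay.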
TCP Internals
Congestion control is built on top of sliding window flow control
Consists of three intertwined mechanisms:
• Additive Increase / Multiplicative Decrease
• Slow Start
• Fast Retransmit
TCP Internals
An additional window size, called the congestion window, is maintained by the sender for each connection (it is not carried in the packet header)
Similar to advertised window, but derived from observed network conditions rather than set directly by sender or receiver
TCP Internals
Effective window size calculation changes:
MaxWindow = MIN(CongestionWindow, AdvertisedWindow)
EffectiveWindow = MaxWindow - (LastByteSent - LastByteAcked)
How is congestion window size calculated?
TCP Internals
Initially it is set to the Maximum Segment Size (MSS)
Whenever a full congestion window of data is successfully transmitted, the size is incremented by MSS - hence the term “additive increase”
TCP Internals
If a packet is dropped (e.g. an ACK times out), it is assumed to be due to network congestion
When a packet is dropped, the congestion window size is cut in half - hence the term “multiplicative decrease”
TCP Internals
Result is that the window size is eased up until a packet is dropped and then it is throttled back
Works OK during the middle of a connection, but takes too long to ramp up when starting from scratch...
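Additive increase / multiplicative decrease can be sketched as a single update step applied once per window. The `MSS` value of 1460 bytes and the floor of one segment are assumptions for the illustration, and the sequence of successes and drops is made up.

```python
MSS = 1460  # assumed segment size in bytes

def aimd(cwnd, window_acked):
    """One AIMD step: +1 MSS per fully ACKed window, halve on a drop."""
    if window_acked:
        return cwnd + MSS
    return max(MSS, cwnd // 2)      # never shrink below one segment

cwnd = MSS
history = []
for ok in [True, True, True, False, True]:
    cwnd = aimd(cwnd, ok)
    history.append(cwnd // MSS)     # window size in segments

print(history)   # [2, 3, 4, 2, 3]
```

The sawtooth in `history` is the "eased up, then throttled back" behaviour, and the one-segment-per-window growth shows why a fresh connection ramps up slowly.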
TCP Internals
Slow Start addresses the initial connection issue and temporarily discards additive increase
Congestion window size starts at 1 packet and is doubled every time a full window is successfully transmitted
Eventually a packet is dropped and additive increase is resumed
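A simplified trace of Slow Start followed by additive increase, using the per-window description above. The function and its parameters are hypothetical, only one loss is modelled, and real implementations add details (such as a slow start threshold) that this sketch leaves out.

```python
def slow_start_trace(round_trips, loss_at):
    """Window size (in packets) per round trip; `loss_at` is the lossy round."""
    cwnd, slow_start, trace = 1, True, []
    for rtt in range(round_trips):
        trace.append(cwnd)
        if rtt == loss_at:
            cwnd = max(1, cwnd // 2)   # multiplicative decrease
            slow_start = False         # resume additive increase from here
        elif slow_start:
            cwnd *= 2                  # exponential growth
        else:
            cwnd += 1                  # additive increase
    return trace

print(slow_start_trace(8, loss_at=4))   # [1, 2, 4, 8, 16, 8, 9, 10]
```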
TCP Internals
Why is it called Slow Start if it changes from linear to exponential growth of congestion window size?
Refers to the difference when compared to the original TCP strategy of always starting with the full advertised window size
TCP Internals
Fast retransmit was not part of original TCP spec
Added by TCP Reno circa 1990 to deal with performance problems
TCP Internals
Fast Retransmit means that if the sender sees a number of duplicate ACKs it retransmits the first packet after the one being ACKed
Assumes that a number of duplicate ACKs implies a dropped packet
TCP Internals
Fast Retransmit in action!
Packet 1
ACK 1
Packet 2
Packet 3
ACK 1
ACK 1
Packet 4
Packet 5
Packet 2
ACK 5
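The sender-side rule can be sketched as a counter of repeated ACKs. The default `dup_threshold=2` is chosen only to match the exchange in the figure (the original ACK 1 plus two duplicates); TCP implementations commonly wait for three duplicate ACKs, so treat the threshold as a tunable assumption.

```python
def fast_retransmit(acks, dup_threshold=2):
    """Return packet numbers resent after seeing `dup_threshold` duplicate ACKs."""
    resent, last_ack, duplicates = [], None, 0
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == dup_threshold:
                resent.append(ack + 1)   # first packet after the ACKed one
        else:
            last_ack, duplicates = ack, 0
    return resent

# ACK stream from the figure: packet 2 was lost, so ACK 1 keeps repeating.
print(fast_retransmit([1, 1, 1, 5]))   # [2]
```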
TCP/IP Weaknesses
IP
• address space is too small
• size of routing information transmitted and stored is too big
• lack of real-time support necessary for voice and multimedia
TCP/IP Weaknesses
Being addressed by IPv6
• Increases address space to 128 bits
• Over 1500 addresses per square foot of the earth’s surface!
Difficult to roll out and guarantee cooperation with IPv4
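For a sense of scale, the raw arithmetic behind the address-density claim can be worked out directly. The Earth radius and the unit conversion are assumed round values, and this counts every one of the 2^128 addresses; the raw count comes out far above 1500 per square foot, so the slide's figure presumably reflects a much more pessimistic estimate of how efficiently addresses get allocated.

```python
import math

EARTH_RADIUS_M = 6.371e6        # assumed mean radius in meters
SQ_FT_PER_SQ_M = 10.7639        # square feet in one square meter

surface_sq_ft = 4 * math.pi * EARTH_RADIUS_M ** 2 * SQ_FT_PER_SQ_M
addresses = 2 ** 128
per_sq_ft = addresses / surface_sq_ft

print(f"{per_sq_ft:.2e} addresses per square foot")
```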
TCP/IP Weaknesses
TCP
• congestion control algorithm is a problem over wireless connections
• maximum packet size of 64K and 32-bit sequence number are too small for broadband pipes
• reliability guarantee causes degradation in multimedia streams
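The sequence number concern is easy to quantify: a 32-bit sequence number counts bytes, so it repeats after 2^32 bytes. The sketch below computes how long that takes at a given line rate; the rates are example values, and `wrap_seconds` is a hypothetical helper.

```python
def wrap_seconds(bits_per_second):
    """Seconds until a 32-bit byte sequence number repeats at full line rate."""
    sequence_space_bytes = 2 ** 32
    return sequence_space_bytes * 8 / bits_per_second

print(wrap_seconds(10e6))   # 10 Mbit/s: a little under an hour
print(wrap_seconds(1e9))    # 1 Gbit/s: about 34 seconds
```

Once the wrap time gets close to how long a delayed packet can linger in the network, an old segment could be mistaken for new data.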
TCP/IP Weaknesses
TCP has unused header bits that could be used for a temporary hack
No structured initiative like IPv6 for solving TCP issues
Resources for the Curious and Diligent
RFCs at www.faqs.org/rfcs
Computer Networks: A Systems Approach by Peterson and Davie
Internetworking with TCP/IP 1, 2, and 3 by Doug Comer
TCP/IP Illustrated 1, 2, and 3 by Richard Stevens
Resources for the Curious and Diligent
Understanding IP Addressing at /www.3com.com/nsc/501302s.html
2 part article on embedding a TCP/IP stack in Dec 99 and Jan 99 issues of ESP