Hybrid Reliable Multicast with TCP-XM
Jon Crowcroft
Marinho P Barcellos, UNISINOS
Stefano Pettini, ESA
Karl Jeacle
CoNEXT 2005
Rationale
Would like to achieve high-speed bulk data delivery to multiple sites
Multicasting would make sense
Existing multicast research has focused on sending to a large number of receivers
But Grid is an applied example where sending to a moderate number of receivers would be extremely beneficial
Multicast availability
Deployment is a problem!
Protocols have been defined and implemented
Valid concerns about scalability; much FUD
“Chicken & egg” means limited coverage
Clouds of native multicast
But can’t reach all destinations via multicast
So applications abandon in favour of unicast
What if we could multicast when possible…
…but fall back to unicast when necessary?
Multicast TCP?
TCP “single reliable stream between two hosts”
Multicast TCP “multiple reliable streams from one to n hosts”
May seem a little odd, but there is precedent…
TCP-XMO – Liang & Cheriton
M-TCP – Mysore & Varghese
M/TCP – Visoottiviseth et al
PRMP – Barcellos et al
SCE – Talpade & Ammar
ACK implosion
Building Multicast TCP
Want to test multicast/unicast TCP approach
But new protocol == kernel change
Widespread test deployment difficult
Build new TCP-like engine
Encapsulate packets in UDP
Run in userspace
Performance is sacrificed…
…but widespread testing now possible
TCP/IP/UDP/IP
[Diagram: Sending Application and Receiving Application, each above a TCP / IP / UDP / IP stack. Labels: “If natively implemented”, “test deployment”.]
TCP engine
Where does initial TCP come from?
Could use BSD or Linux
Extracting from kernel could be problematic
More compact alternative
lwIP = Lightweight IP
Small but fully RFC-compliant TCP/IP stack
lwIP + multicast extensions = “TCP-XM”
TCP-XM overview
Primarily aimed at push applications
Sender initiated – advance knowledge of receivers
Opens sessions to n destination hosts simultaneously
Unicast is used when multicast not available
Options headers used to exchange multicast info
API changes
Sender incorporates multiple destination and group addresses
Receiver requires no changes
TCP friendly, by definition
TCP
[Diagram: message sequence between Sender and Receiver: SYN, SYNACK, ACK, then DATA and ACK, then FIN/ACK in each direction]
TCP-XM
[Diagram: one Sender with Receiver 1, Receiver 2 and Receiver 3]
TCP-XM connection
Connection
User connects to multiple unicast destinations
Multiple TCP PCBs created
Independent 3-way handshakes take place
SSM or random ASM group address allocated (if not specified in advance by user/application)
Group address sent as TCP option
Ability to multicast depends on TCP option
TCP Group Option
New group option sent in all TCP-XM SYN packets
Non TCP-XM hosts will ignore (no option in SYNACK)
Presence implies multicast capability
Sender automatically reverts to unicast if the option is absent

1 byte    1 byte   4 bytes
kind=50   len=6    Multicast Group Address
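The option layout above (kind=50, len=6, four address bytes) can be sketched in C roughly as follows. This is an illustrative sketch rather than the TCP-XM source: the function and macro names are hypothetical, and the parser simply follows the usual kind/length walk over a TCP options area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values taken from the slide: kind=50, len=6 (1 + 1 + 4 bytes). */
#define TCPXM_OPT_GROUP_KIND 50
#define TCPXM_OPT_GROUP_LEN  6

/* Hypothetical helper: write the group option into buf.
 * group_addr is the IPv4 group address in network byte order.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t tcpxm_write_group_opt(uint8_t *buf, size_t buflen,
                             const uint8_t group_addr[4])
{
    assert(buf != NULL && group_addr != NULL);
    if (buflen < TCPXM_OPT_GROUP_LEN) {
        return 0;
    }
    buf[0] = TCPXM_OPT_GROUP_KIND;
    buf[1] = TCPXM_OPT_GROUP_LEN;
    memcpy(&buf[2], group_addr, 4);
    return TCPXM_OPT_GROUP_LEN;
}

/* Hypothetical helper: scan a TCP options area for the group option.
 * Returns 1 and copies the address if found, 0 if absent or malformed.
 * A return of 0 is the case where the sender would stay with unicast. */
int tcpxm_find_group_opt(const uint8_t *opts, size_t optlen,
                         uint8_t group_addr[4])
{
    size_t i = 0;
    assert(opts != NULL && group_addr != NULL);
    while (i < optlen) {
        uint8_t kind = opts[i];
        uint8_t len;
        if (kind == 0) {            /* end of option list */
            break;
        }
        if (kind == 1) {            /* no-operation, single byte */
            i++;
            continue;
        }
        if (i + 1 >= optlen) {      /* no room for a length byte */
            break;
        }
        len = opts[i + 1];
        if (len < 2 || i + len > optlen) {
            break;                  /* malformed option */
        }
        if (kind == TCPXM_OPT_GROUP_KIND && len == TCPXM_OPT_GROUP_LEN) {
            memcpy(group_addr, &opts[i + 2], 4);
            return 1;
        }
        i += len;                   /* skip options we do not know */
    }
    return 0;
}
```

A host that does not understand kind 50 skips it using the length byte, which is consistent with the slide's point that non TCP-XM hosts ignore the option.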
TCP-XM transmission
Data transfer
Data replicated/enqueued on all send queues
PCB variables dictate transmission mode
Data packets are multicast (if possible)
Retransmissions are unicast
Auto fall back/forward to unicast/multicast
Close
Connections closed as per unicast TCP
TCP-XM protocol states
Fall back / fall forward
TCP-XM principle “Multicast if possible, unicast when necessary”
Initial transmission mode is group unicast
Ensures successful initial data transfer
Fall forward to multicast on positive feedback
Typically after ~75K unicast data
Fall back to unicast on repeated mcast failure
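One way to picture the fall back / fall forward rule is as a small per-connection mode switch. The sketch below is only an assumption about how such a switch might look: the field names echo the PCB slide (txmode, nrtxm, msegsntper), but the two thresholds and the function name are invented for illustration and are not values from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum tx_mode { TX_UNICAST, TX_MULTICAST };

/* Illustrative thresholds only; the slides do not state actual values. */
#define XM_FWD_MIN_MCAST_PERCENT 90 /* assumed feedback level to fall forward */
#define XM_BACK_MAX_MCAST_RTX     3 /* assumed retransmit count to fall back */

/* Hypothetical, cut-down per-receiver state. */
struct xm_mode_state {
    enum tx_mode txmode;  /* unicasting or multicasting */
    uint8_t nrtxm;        /* retransmits needed for multicast data */
    uint8_t msegsntper;   /* % of last segs the receiver reported via mcast */
};

/* Re-evaluate the transmission mode after feedback or a retransmit. */
void xm_update_mode(struct xm_mode_state *s)
{
    assert(s != NULL);
    if (s->txmode == TX_UNICAST) {
        /* Fall forward on positive feedback from the receiver. */
        if (s->msegsntper >= XM_FWD_MIN_MCAST_PERCENT) {
            s->txmode = TX_MULTICAST;
            s->nrtxm = 0;
        }
    } else if (s->nrtxm >= XM_BACK_MAX_MCAST_RTX) {
        /* Fall back on repeated multicast failure. Clearing the stale
         * feedback (an assumption) avoids flipping straight back. */
        s->txmode = TX_UNICAST;
        s->msegsntper = 0;
    }
}
```

Starting in TX_UNICAST matches the slide's point that the initial mode is chosen to guarantee a successful first transfer.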
Multicast feedback Option
New feedback option sent in ACKs from receiver
Only used between TCP-XM hosts
Indicates % of last n packets received via multicast
Used by sender to fall forward to multicast transmission

1 byte    1 byte   1 byte
kind=51   len=3    % Multicast Packets Received
TCP-XM reception
Receiver
No API-level changes
Normal TCP listen
Auto-IGMP join on TCP-XM connect
Accepts data on both unicast/multicast ports
tcp_input() accepts:
packets addressed to existing unicast destination…
…but now also those addressed to multicast group
Tracks how last n segs received (u/m)
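The receiver-side bookkeeping ("tracks how last n segs received") and the feedback option it feeds (kind=51, len=3) could be sketched as a ring buffer plus a small encoder. The names below are hypothetical and the window size of 128 is borrowed from the msegrcv[128] array on the PCB slide; the real implementation may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCPXM_OPT_FEEDBACK_KIND 51  /* from the feedback option slide */
#define TCPXM_OPT_FEEDBACK_LEN   3
#define XM_WINDOW              128  /* assumed n, as in msegrcv[128] */

/* Hypothetical ring buffer recording how recent segments arrived. */
struct xm_rcv_window {
    uint8_t ismcast[XM_WINDOW]; /* 1 if the segment arrived via multicast */
    uint16_t count;             /* segments recorded, at most XM_WINDOW */
    uint16_t next;              /* slot to overwrite next */
    uint16_t mcast;             /* recorded segments that were multicast */
};

/* Record one received segment, evicting the oldest once the window is full. */
void xm_record_segment(struct xm_rcv_window *w, int via_mcast)
{
    assert(w != NULL);
    if (w->count == XM_WINDOW) {
        w->mcast = (uint16_t)(w->mcast - w->ismcast[w->next]);
    } else {
        w->count++;
    }
    w->ismcast[w->next] = via_mcast ? 1 : 0;
    w->mcast = (uint16_t)(w->mcast + w->ismcast[w->next]);
    w->next = (uint16_t)((w->next + 1) % XM_WINDOW);
}

/* Percentage (0..100) of recorded segments received via multicast. */
uint8_t xm_mcast_percent(const struct xm_rcv_window *w)
{
    assert(w != NULL);
    if (w->count == 0) {
        return 0;
    }
    return (uint8_t)((w->mcast * 100u) / w->count);
}

/* Write the 3-byte feedback option; returns bytes written or 0. */
size_t xm_write_feedback_opt(uint8_t *buf, size_t buflen,
                             const struct xm_rcv_window *w)
{
    assert(buf != NULL && w != NULL);
    if (buflen < TCPXM_OPT_FEEDBACK_LEN) {
        return 0;
    }
    buf[0] = TCPXM_OPT_FEEDBACK_KIND;
    buf[1] = TCPXM_OPT_FEEDBACK_LEN;
    buf[2] = xm_mcast_percent(w);
    return TCPXM_OPT_FEEDBACK_LEN;
}
```

Keeping a running count of multicast segments means the percentage is available in constant time whenever an ACK is built.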
API changes
Only relevant if natively implemented!
Sender API changes
New connection type
Connect to port on array of destinations
Single write sends data to all hosts
TCP-XM in use:
conn = netconn_new(NETCONN_TCPXM);
netconn_connectxm(conn, remotedest, numdests, group, port);
netconn_write(conn, data, len, …);
PCB changes
Every TCP connection has an associated Protocol Control Block (PCB)
TCP-XM adds:
struct tcp_pcb {
  …
  struct tcp_pcb *firstpcb;   /* first of the mpcbs */
  struct tcp_pcb *nextm;      /* next tcpxm pcb */
  enum tx_mode txmode;        /* unicasting or multicasting */
  u8_t nrtxm;                 /* number of retransmits for multicast */
  u32_t nrtxmtime;            /* time since last retransmit */
  u32_t mbytessent;           /* total bytes sent via multicast */
  u32_t mbytesrcvd;           /* total bytes received via multicast */
  u32_t ubytessent;           /* total bytes sent via unicast */
  u32_t ubytesrcvd;           /* total bytes received via unicast */
  struct segrcv msegrcv[128]; /* ismcast boolean for last n segs */
  u8_t msegrcvper;            /* % of last segs received via mcast */
  u8_t msegrcvcnt;            /* counter for segs recvd via mcast */
  u8_t msegsntper;            /* % of last segs delivered via mcast */
};
Linking PCBs
*next points to the next TCP session
*nextm points to the next TCP session that’s part of a particular TCP-XM connection
Minimal timer and state machine changes
[Diagram: PCB list M1, U1, M2, M3, U2; each PCB has a next and a nextm pointer]
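The two chains can be illustrated with a cut-down PCB that carries only the link fields. The sketch assumes, based on the labels in the diagram, that M1, M2 and M3 belong to one TCP-XM connection while U1 and U2 are ordinary unicast sessions; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down PCB with just the two link fields. */
struct xm_link_pcb {
    const char *label;
    struct xm_link_pcb *next;   /* next TCP session of any kind */
    struct xm_link_pcb *nextm;  /* next session in the same TCP-XM connection */
};

/* Count every session on the global list by following next. */
size_t xm_count_sessions(const struct xm_link_pcb *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next) {
        n++;
    }
    return n;
}

/* Count the members of one TCP-XM connection by following nextm. */
size_t xm_count_members(const struct xm_link_pcb *first)
{
    size_t n = 0;
    for (; first != NULL; first = first->nextm) {
        n++;
    }
    return n;
}

/* Append pcb to the end of the TCP-XM chain that starts at first. */
void xm_append_member(struct xm_link_pcb *first, struct xm_link_pcb *pcb)
{
    struct xm_link_pcb *p = first;
    assert(first != NULL && pcb != NULL);
    while (p->nextm != NULL) {
        p = p->nextm;
    }
    p->nextm = pcb;
    pcb->nextm = NULL;
}
```

Because the nextm chain is separate from next, code that walks all sessions (timers, input demultiplexing) can stay as it is, which fits the slide's note about minimal timer and state machine changes.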
What happens to the cwin?
Multiple receivers
Multiple PCBs
Multiple congestion windows
Default to min(cwin), i.e. send at rate of slowest receiver
Is this really so bad?
Compare to time taken for n unicast transfers
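The min(cwin) default amounts to one pass over the PCBs of a TCP-XM connection. The following is a minimal sketch under the assumption that each member PCB keeps its own congestion window in bytes; the struct, its field types and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down PCB: one congestion window per receiver. */
struct xm_cw_pcb {
    uint32_t cwnd;              /* congestion window for this receiver, bytes */
    struct xm_cw_pcb *nextm;    /* next PCB in the same TCP-XM connection */
};

/* Return the smallest congestion window across all member PCBs,
 * i.e. the window that paces multicast sending to the slowest receiver. */
uint32_t xm_min_cwnd(const struct xm_cw_pcb *first)
{
    const struct xm_cw_pcb *p;
    uint32_t min;
    assert(first != NULL);
    min = first->cwnd;
    for (p = first->nextm; p != NULL; p = p->nextm) {
        if (p->cwnd < min) {
            min = p->cwnd;
        }
    }
    return min;
}
```

For example, with windows of 5840, 8760 and 2920 bytes, the multicast send would be limited to 2920 bytes in flight.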
LAN speed
LAN efficiency
WAN speed
WAN efficiency
Multiple TCP-XM flows
TCP-XM vs TCP flows
Protocol performance
Protocol efficiency
Grid multicast?
How can multicast be used in Grid environment?
TCP-XM is new multicast-capable protocol
Globus is de-facto Grid middleware
Would like TCP-XM support in Globus…
Globus XIO
eXtensible Input Output library
Allows “i/o plugins” to Globus
API
Single POSIX-like API / set of semantics
Simple open/close/read/write API
Driver abstraction
Hides protocol details / Allows for extensibility
Stack of 1 transport & n transform drivers
Drivers can be selected at runtime
XIO architecture
XIO implementation
XIO/XM driver specifics
Two important XIO data structures
1. Handle
Returned to user when XIO framework ready
Used for all open/close/read/write calls
lwIP netconn connection structure used
2. Attribute
Used to set XIO driver-specific parameters…
…and TCP-XM protocol-specific options
List of destination addresses
XIO code example
// init stack
globus_xio_stack_init(&stack, NULL);
// load drivers onto stack
globus_xio_driver_load("tcpxm", &txdriver);
globus_xio_stack_push_driver(stack, txdriver);
// init attributes
globus_xio_attr_init(&attr);
globus_xio_attr_cntl(attr, txdriver, GLOBUS_XIO_TCPXM_SET_REMOTE_HOSTS,
                     hosts, numhosts);
// create handle
globus_xio_handle_create(&handle, stack);
// send data
globus_xio_open(&handle, NULL, target);
globus_xio_write(handle, "hello\n", 6, 1, &nbytes, NULL);
globus_xio_close(handle, NULL);
One-to-many issues
Stack assumes one-to-one connections
XIO user interface requires modification
Needs support for one-to-many protocols
Minimal user API changes
Framework changes more significant
GSI is one-to-one
Authentication with peer on connection setup
But cannot authenticate with n peers
Need some form of “GSI-M”
Driver availability
Multicast transport driver for Globus XIO
Requires Globus 3.2 or later
Source code online
Sample client
Sample server
Driver installation instructions
http://www.cl.cam.ac.uk/~kj234/xio/
mcp & mcpd
Multicast file transfer application using TCP-XM
‘mcpd &’ on servers
‘mcp file host1 host2… hostN’ on client
http://www.cl.cam.ac.uk/~kj234/mcp/
Full source code online
FreeBSD, Linux, Solaris
All done!
Thanks for listening!
Questions?