114
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 Multicast Shivkumar Kalyanaraman [email protected] http://www.ecse.rpi.edu/Homepages/shivkuma Adapted in part from Srini Seshan’s (CMU) and Ion Stoica (UCB)

IP: Addresses and Forwarding

  • Upload
    others

  • View
    12

  • Download
    0

Embed Size (px)

Citation preview

Page 1: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

1

Multicast

Shivkumar [email protected]

http://www.ecse.rpi.edu/Homepages/shivkumaAdapted in part from Srini Seshan’s (CMU) and Ion Stoica (UCB)

Page 2: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

2

Overview

Why multicast ? Multicast apps ...Concepts: groups, scopes, treesMulticast addresses, LAN multicastGroup management: IGMPMulticast routing and forwarding: MBONE, PIMReliable Multicast Transport Protocols

Multicast Congestion Control Deployment issues, Source-Specific Multicast (SSM), Application-level Multicast

Page 3: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

3

Multicast = Efficient Data Distribution

Src Src

Page 4: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

4

Why Multicast ?Need for efficient one-to-many delivery of same dataApplications:

News/sports/stock/weather updatesDistance learningConfiguration, routing updates, service locationPointcast-type “push” appsTeleconferencing (audio, video, shared whiteboard, text editor)Distributed interactive gaming or simulationsEmail distribution listsContent distribution; Software distributionWeb-cache updates Database replication

Page 5: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

5

Why Not Broadcast or Unicast?Broadcast:

Send a copy to every machine on the netSimple, but inefficientAll nodes must process packet even if they don’t careWastes more CPU cycles of slower machines (“broadcast radiation”)Network loops lead to “broadcast storms”

Replicated Unicast:Sender sends a copy to each receiver in turnReceivers need to register or sender must be pre-configuredSender is focal point of all control trafficReliability => per-receiver state, separate sessions/processes at sender

Page 6: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

6

Why IP multicast ?Application-layer relays:

A “relay” node or set of nodes does the replicatedunicast function instead of the sourceMultiple relays can handle “groups” of receivers and reduce number of packets per multicast => efficiencyManager has to manually configure names of receivers in relays etc App-level topology may be sub-optimal

But bandwidth is becoming cheaperBecoming more popular in content distribution

IP Multicast: replication/multicast engine at the network layer

Page 7: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

7

Multicast Apps CharacteristicsNumber of (simultaneous) senders to the groupThe size of the groups

Number of members (receivers)Geographic extent or scopeDiameter of the group measured in router hops

The longevity of the groupNumber of aggregate packets/secondThe peak/average used by sourceLevel of human interactivity

Lecture mode vs interactiveData-only (eg database replication) vs multimedia

Page 8: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

8

IP Multicast Architecture

Hosts

Routers

Service modelService model

Host-to-router protocol(IGMP)

Multicast routing protocols(various)

Page 9: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

9

IP Multicast model: RFC 1112Message sent to multicast “group” (of receivers)

Senders need not be group membersA group identified by a single “group address”

Use “group address” instead of destination address in IP packet sent to group

Groups can have any size;Group members can be located anywhere on the InternetGroup membership is not explicitly knownReceivers can join/leave at will

Page 10: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

10

IP Multicast Concepts (Continued)Packets are not duplicated or delivered to destinations outside the group

Distribution tree constructed for delivery of packetsPackets forwarded “away” from the sourceNo more than one copy of packet appears on any subnet Packets delivered only to “interested” receivers => multicast delivery tree changes dynamicallyNetwork has to actively discover paths between senders and receivers

Page 11: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

11

IP Multicast AddressesClass D IP addresses

224.0.0.0 – 239.255.255.255

Address allocation:Well-known (reserved) multicast addresses, assigned by IANA: 224.0.0.x and 224.0.1.xTransient multicast addresses, assigned and reclaimed dynamically, e.g., by “sdr” program

Each multicast address represents a group of arbitrary size, called a “host group”There is no structure within class D address space like subnetting => flat address space

1 1 1 0 Group ID

Page 12: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

12

IP Multicast Service — SendingUses normal IP-Send operation, with an IP multicast address specified as the destinationMust provide sending application a way to:

Specify outgoing network interface, if >1 availableSpecify IP time-to-live (TTL) on outgoing packetEnable/disable loop-back if the sending host is/isnt a member of the destination group on the outgoing interface

Page 13: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

13

IP Multicast Service — ReceivingTwo new operations

Join-IP-Multicast-Group(group-address, interface)Leave-IP-Multicast-Group(group-address, interface)

Receive multicast packets for joined groups via normal IP-Receive operation

Page 14: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

14

Link-Layer Transmission/ReceptionTransmission• IP multicast packet is transmitted as a link-layer

multicast, on those links that support multicast• Link-layer destination address is determined by an

algorithm specific to the type of link• Reception

• Necessary steps are taken to receive desired multicasts on a particular link, such as modifying address reception filters on LAN interfaces

• Multicast routers must be able to receive all IP multicasts on a link, without knowing in advance which groups will be used

Page 15: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

15

Using Link-Layer Multicast AddressesEthernet and other LANs using 802 addresses:

Direct mapping! Simpler than unicast! No ARP etc.32 class D addrs may map to one MAC addr

Special OUI for IETF: 0x01-00-5E. No mapping needed for point-to-point links

LAN multicast address

0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 1 1 1 0 0

1 1 1 0 28 bits

23 bits

IP multicast address

Group bit

Page 16: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

16

Multicast over LANs & ScopingMulticasts are flooded across MAC-layer bridges along a spanning tree

But flooding may steal sending opportunity for non-member stations which want to transmitAlmost like broadcast!

Scope: How far do transmissions propagate?Implicit scoping: Reserved Mcast addresses => don’t leave subnet.

Also called “link-local” addresses

Page 17: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

17

Scope of Multicast ForwardingTTL-based scoping:

Multicast routers have a configured TTL thresholdMcast datagram dropped if TTL <= TTL thresholdUseful as a blanket parameter.

Administrative scoping:Use a portion of class D address space (239.0.0.0 thru 239.255.255.255)Truly local to admin domain; address reuse possible. In IPv6, scoping is an internal attribute of an IPv6 multicast address

Page 18: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

18

Multicast Scope Control – Small TTLsTTL expanding-ring search to reach or find a nearby subset of a groupRings can be nested, but not overlapping

s

1

2

3

Page 19: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

19

Multicast Scope ControlAdministratively-Scoped Addresses (RFC 1112 )

Uses address range 239.0.0.0 — 239.255.255.255Supports overlapping (not just nested) domains

An administrative domain

The rest of the Internet

address boundary set oninterfaces to these links

Page 20: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

20

IP Multicast Architecture

Hosts

Routers

Service model

HostHost--toto--router protocolrouter protocol(IGMP)(IGMP)

Multicast routing protocols(various)

Page 21: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

21

Internet Group Management ProtocolIGMP: “signaling” protocol to establish, maintain, remove groups on a subnet.Objective: keep router up-to-date with group membership of entire LAN

Routers need not know who all the members are, only that members exist

Each host keeps track of which mcast groups are subscribed to

Socket API informs IGMP process of all joins

Page 22: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

22

How IGMP Works

QRouters:

Hosts:

On each link, one router is elected the “querier”Querier periodically sends a Membership Querymessage to the all-systems group (224.0.0.1), with TTL = 1On receipt, hosts start random timers (between 0 and 10 seconds) for each multicast group to which they belong

Page 23: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

23

How IGMP Works (cont.)

Q

G G G G

Routers:

Hosts:

When a host’s timer for group G expires, it sends a Membership Report to group G, with TTL = 1Other members of G hear the report and stop (suppress) their timersRouters hear all reports, and time out non-responding groups

Page 24: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

24

How IGMP Works (cont.)Normal case: only one report message per group present is sent in response to a query

Query interval is typically 60-90 secondsWhen a host first joins a group, it sends immediate reports, instead of waiting for a query

IGMPv2: Hosts may send a “Leave group” message to “all routers” (224.0.0.2) address

Querier responds with a Group-specific Query message: see if any group members are present

Lower leave latency

Page 25: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

25

IGMPv2

DR adds membership

1.1.1.1

H1 H3

1.1.1.10 1.1.1.11 1.1.1.12

H3

rtr-a

H2 Report for232.1.2.3

Host sends IGMPv2 report for group 1st *,G join

DR sends *,G join to RP (it has to, it doesn’t know the sources)

2nd S,G join(s)

DR sends S,G join to source (data provides the sources)

Page 26: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

26

IGMPv3

1.1.1.1

H1 H3

1.1.1.10 1.1.1.11 1.1.1.12

H3

rtr-a

H2 Report for232.1.2.3

source_list

Host sends IGMPv3 report for group which can specify a list of sources to explicitly include.

DR adds membership.

S,G join(s)

DR sends S,G join directly to sources in the source_list, and isnot required to send *,G join to RP (see SSM discussion later)

Page 27: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

27

IP Multicast Architecture

Hosts

Routers

Service model

Host-to-router protocol(IGMP)

Multicast routing protocols

Page 28: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

28

Multicast RoutingBasic objective – build distribution tree for multicast packets

The “leaves” of the distribution tree are the subnets containing at least one group member (detected by IGMP)

Multicast service model makes it hardAnonymityDynamic join/leave

Page 29: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

29

Simple Mcast Routing TechniquesFlood and prune

Begin by flooding traffic to entire networkPrune branches with no receiversExamples: DVMRP, PIM-DMUnwanted state where there are no receivers

Link-state multicast protocolsRouters advertise groups for which they have receivers to entire networkCompute trees on demandExample: MOSPFUnwanted state where there are no senders

Page 30: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

30

How to Flood Efficiently ?A router forwards a packet from source (S) iff it arrives via the shortest path from the router back to S

Reverse path check!Packet is replicated out all but the incoming interfaceReverse shortest paths easy to compute just use info in DV routing tables

DV gives shortest reverse pathsEfficient if costs are symmetric xx yy

tt

SS

a

zz

Forward packets that arriveon shortest path from “t” to “S” (assume symmetric routes)

Page 31: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

31

Problem

xx yy

zz

SS

a

b

duplicate packet

Flooding can cause a given packet to be sent multiple times over the same link: can filter better than this!

Solution: Reverse Path Broadcasting

Page 32: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

32

Reverse Path Broadcasting (RPB)Basic idea: forward a packet from S only on child links for SChild link of router x for source S: link that has x as parent on the shortest path from the link to S

xx yy

zz

SS

a

b

5 6

child link of xfor S

forward onlyto child link

Page 33: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

33

How to Find Child Links?Routing updates !

If z tells x that it can reach S through y, and if the cost of this path is >= the cost of the path from z to S through x, then x knows that the link to z is a child link

In case of tie, lower address wins

xx yy

zz

SS

a

b

5 6

child link of xfor S

forward onlyto child link

Page 34: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

34

ProblemThis is still a broadcast algorithm – the traffic goes everywhere – lousy filtering!First order solution:

Truncated RPB

Page 35: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

35

Truncated RPBDon't forward traffic onto networks with no receivers1. Identify leaves (see below)2. Detect group membership in leaf (IGMP)Identify leaves

Leaf links are the child links that no other router uses to reach source SUse periodic updates of form:

“this is my next-link to source S”If child is not the “next-link” for anyone, it is a leaf

Page 36: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

36

Reverse Path Multicast (RPM)

xx yy

tt

SS

vv bbaa

aa bb

data messageprune message

Prune back transmission so that only absolutely necessary links carry trafficUse on-demand pruning so that router group state scales with number of active groups

Page 37: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

37

Basic RPM IdeaPrune (Source,Group) at leaf if no members

Send Non-Membership Report (NMR) up the treeIf all children of router R prune (S,G)

Propagate prune for (S,G) to parent ROn timeout:

Prune droppedFlow is reinstatedDown stream routers re-pruneNote: this is a soft-state approach

Grafting: Explicitly reinstate sub-tree when IGMP detects new members at leaf, or when a child asks for a graft.

Page 38: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

38

Putting it together: Topology

G G

S

G

Page 39: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

39

Flood with Truncated BroadcastG G

S

G

Page 40: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

40

PruningG G

S

Prune (s,g)

Prune (s,g)

G

Page 41: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

41

Grafting

Graft (s,g)

Graft (s,g)

G G

S

G

G

Report (g)

Page 42: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

42

After Grafting CompleteG G

S

G

G

Page 43: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

43

Distance-Vector Multicast RoutingDVMRP consists of two major components:

A conventional distance-vector routing protocol (like RIP) A protocol for determining how to forward multicast packets, based on the unicast routing table

DVMRP router forwards a packet ifThe packet arrived from the link used to reach the source of the packet

Reverse path forwarding check – RPFPacket forwarded ONLY to child linksIf downstream links have not pruned the tree

Page 44: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

44

DVMRP limitationsLike distance-vector protocols, affected by count-to-infinity and transient looping

Multicast trees more vulnerable than unicast !Shares the scaling limitations of RIP. New scaling limitations:

(S,G) state in routers: even in pruned parts!Broadcast-and-prune has an initial broadcast.Limited to few senders. Many small groups also undesired. Why ?

No hierarchy: flat routing domain

Page 45: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

45

Multicast Backbone (MBone)An overlay network of IP multicast-capable routers using DVMRPTools: sdr (session directory), vic, vat, wb

Host/routerMBone routerPhysical linkTunnelPart of MBone

R R

R

H R

H

R

RH

Page 46: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

46

MBone Tunnels

IP header,dest = unicast

IP header,dest = multicast

Transport headerand data…

A method for sending multicast packets through multicast-ignorant routersIP multicast packet is encapsulated in a unicast IP packet (IP-in-IP) addressed to far end of tunnel:

Tunnel acts like a virtual point-to-point linkIntermediate routers see only outer header Tunnel endpoint recognizes IP-in-IP (protocol type = 4) and de-capsulates datagram for processing

Each end of tunnel is manually configured withunicast address of the other end

Page 47: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

47

Simple Routing Techniques (recall)Flood and prune

Begin by flooding traffic to entire networkPrune branches with no receiversExamples: DVMRP, PIM-DMUnwanted state where there are no receivers

Link-state multicast protocolsRouters advertise groups for which they have receivers to entire networkCompute trees on demandExample: MOSPFUnwanted state where there are no senders

Page 48: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

48

Multicast OSPF (MOSPF)

Extend OSPF to support multicastMulticast-capable routers flag link state routing advertisementsLink-state packets include multicast group addresses to which local members have joinedRouting algorithm augmented to compute shortest-path distribution tree from a source to any set of destinations

Page 49: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

49

MOSPF: Example

Source 1

Receiver 1

Receiver 2

Z

W

Q

T

Page 50: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

50

Link Failure/Topology Change

Source 1

Receiver 1

Receiver 2

Z

W

Q

T

X

Page 51: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

51

Group Membership Change

Source 1

Receiver 1

Receiver 2

Z

W

Q

T

Receiver 3

Page 52: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

52

MOSPF: Impact on Route Computation

Can’t pre-compute all source multicast treesCompute tree on-demand when first packet from a source S to a group G arrivesNew link-state advertisement

May lead to addition or deletion of outgoing interfaces if it contains different group addressesMay lead to re-computation of entire tree if links are changed

Page 53: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

53

Routing Techniques (taxonomy)Tree building methods:

Data-driven: calculate the tree only when the first packet is seen. Eg: DVMRP, MOSPFControl-driven: Build tree in background before any data is transmitted. Eg: CBT

Join-styles:Explicit-join: The leaves explicitly join the tree. Eg: CBT, PIM-SMImplicit-join: All subnets are assumed to be receivers unless they say otherwise (eg via tree pruning). Eg: DVMRP, MOSPF

Page 54: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

54

Shared vs. Source-based TreesSource-based trees

Separate shortest path tree for each sender(S,G) state at intermediate routersEg: DVMRP, MOSPF, PIM-DM, PIM-SM

Shared treesSingle tree shared by all membersData flows on same tree regardless of sender(*,G) state at intermediate routersEg: CBT, PIM-SM

Page 55: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

55

Source-based Trees

RouterSourceReceiver

SR

R

R

R

R

S

S

Page 56: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

56

A Shared Tree

RPRP

RouterSourceReceiver

S

S

SR

R

R

R

R

Page 57: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

57

Shared vs. Source-Based TreesSource-based trees

Shortest path trees – low delay, better load distributionMore state at routers (per-source state)Efficient in dense-area multicast

Shared treesHigher delay (bounded by factor of 2), traffic concentrationChoice of core affects efficiencyPer-group state at routersEfficient for sparse-area multicast

Page 58: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

58

Core-based Routing ProtocolsSpecify “meeting place” aka “core” or “rendezvous point (RP)”Sources send initial packets to coreReceivers join group at coreRequires mapping between multicast group address and “meeting place” Examples: CBT, PIM-SM

Page 59: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

59

Protocol Independent Multicast (PIM) Support for both shared and per-source treesDense mode (per-source tree)

Similar to DVMRPSparse mode (shared tree)

Core = rendezvous point (RP)Independent of unicast routing protocol

Just uses unicast forwarding table

Page 60: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

60

PIM Protocol Overview

Basic protocol stepsRouters with local members Join toward Rendezvous Point (RP) to join shared treeRouters with local sources encapsulate data in Register messages to RP Routers with local members may initiate data-driven switch to source-specific shortest path trees

PIM v.2 Specification (RFC2362)

Page 61: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

61

PIM Example: Build Shared Tree

Source 1

Receiver 1

Receiver 2

(*,G)

Receiver 3

(*,G)

(*,G)

(*,G)

(*,G)

(*,G)

Join messagetoward RP

Shared tree after R1,R2 join

RP

Page 62: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

62

Data Encapsulated in Register

Source 1

Receiver 1

Receiver 2

(*,G)

Receiver 3

(*,G)

(*,G)

(*,G)

(*,G)

(*,G)

Unicast encapsulated data packet to RP in Register

RP

RP de-capsulates, forwards down shared tree

Page 63: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

63

RP Send Join to High Rate Source

Source 1

Receiver 1

Receiver 2 Receiver 3

(S1,G)

RP

Join messagetoward S1

Shared tree

Page 64: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

64

Build Source-Specific Distribution Tree

Source 1

Receiver 1

Receiver 2 Receiver 3

Join messages

Shared Tree

RP

Build source-specific tree for high data rate source

(S1,G),(*,G)

(S1, G)

(S1,G),(*,G)(S1,G),(*,G)

Page 65: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

65

Forward On “Longest-match” Entry

Source 1

Receiver 1

Receiver 2 Receiver 3

Source 1 Distribution Tree

Shared Tree

RP

(S1,G),(*,G)

(S1, G)

(S1,G),(*,G)(S1,G),(*,G)

Source-specific entry is “longer match” for source S1 than is Shared tree entry that can be used by any source

(*, G)

Page 66: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

66

Prune S1 off Shared Tree

Prune S1 off shared tree where if S1 and RP entries differ

Source 1

Receiver 1

Receiver 2 Receiver 3

Source 1 Distribution Tree

Shared Tree

RP

Prune S1

Page 67: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

67

Multicast Address AllocationUnicast IP addresses

Allocated statically to hostsAllocated hierarchically by hosts’ org

Multicast IP addressesAllocated per sessionGroups cross admin boundaries

Page 68: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

68

Address Allocation SolutionsCentral server?

Blocks of addresses for internal groupsLeases for global addresses

Random allocation?Send to group to ask if address is in useNeed conflict resolution protocol

Page 69: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

69

Multicast Address-Set Claim (MASC)Address allocation goals

Efficient utilization of address spaceAggregation of routing entriesRobust and scalable

Distributed, collaborative allocationDynamic claim and listen protocolClaim “prefixes” from parentCommunicate prefixes to local addresssallocation

Page 70: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

70

MASC

Domain A224.0.0.0/16

Domain B224.0.128.0/24

Domain C224.0.1.0/24

Domain F Domain G

A1 A1

B1

B2

F1

C1

C2

G1

Page 71: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

71

Prefix SelectionCollect available prefixesRemove those that have been claimedRandomly select one of remaining prefixes with shortest mask (largest range)Claim first sub-prefix of desired size

Page 72: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

72

Reliable Multicast: The GoalImplement reliability on top of IP multicast

Don’t necessarily care about ordering or byte-stream abstraction

Why is this hard ?Sender cannot keep state for unknown number of dynamic receivers

Remember open & dynamic group semantic?Algorithms like TCP that estimate path properties such as RTT and congestion window don’t generalize to trees.

Remember: TCP is only for a unicast session!Has to address (N)ACK implosions

Page 73: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

73

ImplosionPacket 1 is lost All 4 receivers request a resend

R1

S

R3 R4

R2

21

R1

S

R3 R4

R2

Resend request

Page 74: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

74

RetransmissionRe-transmitter

Options: sender, other receiversHow to retransmit

Unicast, multicast, scoped multicast, retransmission group, …

Problem: retransmissions (aka repairs) may reach destinations that don’t require a retransmission

A.k.a “exposure” problemSolution: subcast the re-transmission only to receivers that need it.

Page 75: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

75

Why Subcast? Exposure problem…Packet 1 does not reach R1;Receiver 1 requests a resend

Packet 1 resent to all 4 receivers

R1

S

R3 R4

R2

21

1

Resend request

R1

S

R3 R4

R2

1

Resent packet

Page 76: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

76

Ideal Recovery ModelPacket 1 reaches R1 but is lost before reaching other Receivers

Only one receiver sends NACK to the nearest S or R with packet

S

R3 R4

R2

2

1

1

R1

S

R3 R4

R2

Resend request

1Resent packet

Repair sent only to those that need packet

R1

Page 77: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

77

Using the Routers

R1

S

R3 R4

R2

• Routers do transport level processing:

• Buffer packets• Combine ACKs• Send retransmissions

• Model solves implosion and exposure, but not scalable

• Violates end-to-end argument

NACK

RTX

Router

Page 78: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

78

Reliable Multicast Transport: IssuesRetransmission can make reliable multicast as inefficient as replicated unicast

(N)ACK-implosion if all destinations ack at once“Crying baby”: a bad link affects entire group

Heterogeneity: receivers, links, group sizesAnonymous/Open/Dynamic Group Model:

Source does not know # of destinations, and destinations may vanish

Multicast applications do not need strong reliability of the type provided by TCP.

Can tolerate some reordering, delay, etc

Page 79: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

79

RMT building blocks: RFC 3048NACK only: Eg: SRM uses only end-to-end mechanisms. Tree-based ACK: aggregators reduce reverse traffic. Eg: RMTP-IIAsynchronous Layered Coding (ALC): use of forward-error correction (FEC), and no feedback, aka “proactive” FECRouter assist: use of NAKs but router support for aggregation. Eg: PGM

FEC retransmissions (aka reactive FEC) instead of data retransmissions

Page 80: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

80

Classes of RMT Protocols

•Ring Algorithm

•ACK & NACK

•Many to Many

•Lowest Scalability

•No Hierarchy

•Multicast NACK & FEC

•Simple to Implement

•Moderate Scalability

•Tree with Hard State

•ACK & NACK & FEC

•Confirmed Delivery

•Requires Configuration

Ring

RMP

Ring

RMPTRACK

RMTP-II

TRACK

RMTP-IINACK-Only

SRM

NACK-Only

SRM

Page 81: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

81

Classes of RMT Protocols

•One-level TRACK

•ACK & NACK

•Simple to Implement

•Non Real Time

File Transfer

MFTP

File Transfer

MFTP

•Tree with Soft State

•NACK & FEC

•Scaleable & Auto Config

•Req. New Router Code

Router-Assist

PGM

Router-Assist

PGMALC (FEC)

Digital Fountain

ALC (FEC)

Digital Fountain

•No Return Channel

•FEC Only

•Highest Scalability & Supports Hetergeneity

•Best for non real time or semi-reliable video

⌦ ⌫⌫

Page 82: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

82

SRMOriginally designed for wbReceiver-reliable

NACK-basedEvery member may multicast NACK or retransmission

Page 83: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

83

SRM Request SuppressionPacket 1 is resent; R2 and R3 no longer have to request a resend

Packet 1 is lost; R1 requests resend to Source and Receivers

R1

S

R3

R2

21

Delay varies by distance

X

Resend request

R1

S

R3

R2

1

X

X

Resent packet

Page 84: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

84

SRM: Stochastic Suppression

datad

d

d

d

Time

NACK

repair

2d

session msg

0

1

2

3

Delay = U[0,D2] ×dS,R

= Sender

= Repairer

= Requestor

Page 85: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

85

SRM: SummaryAll members get all the data that has been sent to the the multicast group (packet reliability )Repair requests/responses are multicast.Scope of repair requests/responses can be TTL limited or a separate “local recovery group” can be formedTechniques to avoid implosion of repair requests, and reduce control traffic: NAK backoff timersNACK/Retransmission suppression

Delay before sendingDelay based on RTT estimationDeterministic + Stochastic components

Periodic session messagesFull reliabilityEstimation of distance matrix among members

Page 86: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

86

RMTP: Fixed HierarchyRcvr unicasts periodic ACK to its Designated Receiver (DR)DR unicasts its own ACK to its parentRcvr chooses closest statically configured (DR)Mcast or unicastretransmission

Based on percentage of requestsScoped mcast for local recovery

D R2

R5

S

R3 R4

R1

R R R R

D

R R

D

D DR R Receiver R* Router

Page 87: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

87

RMTP: Comments+: Heterogeneity

Lossy link or slow receiver will only affect a local region

−: Position of DR criticalStatic hierarchy cannot adapt local recovery zone to loss points

Page 88: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

88

Pragmatic General Multicast Packet 1 resent to R2, R3, R4; Not resent to R1

R1

S

R3 R4

R2

21

1

X

1 1

1

Routers remember resend requests

Resend request

Packet 1 reaches only R1; R2, R3, R4 request resends

R1

S

R3 R4

R2

1

1 1

1

Resent packet

Page 89: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

89

ALC Protocol: Digital FountainInstead of using a single multicast group, split data over multiple groups

Example: Groups at 8, 16, 32, 64, 128, 256 KbpsReceivers can receive from 8 - 504 Kbps

Requires a way of splitting data upHierarchical video codecsFile transfer with Tornado FEC codes

Supports very large and heterogeneous groups, simple to implementRequires sparse mode routing protocols

Page 90: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

90

Multicast Congestion Control

R

R

R

S

???Mb/s

100Mb/s

100Mb/s

1Mb/s

1Mb/s

56Kb/s

R

What if receivers have very different bandwidths?Send at max?Send at min?Send at avg?

Page 91: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

91

Single Rate: Drop To Zero Problem

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

- 10 20 30 40 50

Number of Receivers

Perc

enta

ge o

f Nom

inal

Thr

ough

put

0.01%0.10%1.00%10.00%

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

0 10 20 30 40 50

Number of Receivers

Perc

enta

ge o

f Nom

inal

Thr

ough

put

1234681216

Above: Drop-towards-zero on a TCP Time Scale, for different loss rates

Right: Drop-towards-zero For different averaging periods

AssumptionsAll receivers have same

loss rate (worst case)All other assumptions

are optimistic, providing an upper bound

Page 92: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

92

Single Rate: TCP FriendlinessSingle Rate: TCP Friendliness

T C P T h ro u g h p u t E q u atio n s

1 0

1 0 0

1 ,0 0 0

1 0 ,0 0 0

1 0 0 ,0 0 0

0 .0 1 0 .1 1 1 0 1 0 0P ercen t L o ss

Kbp

s

S im p leT0= 3*RT0= 6*RT0= 12*RT0= 18*RFairness

Safety

Page 93: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

93

PGM Congestion Control (PGMCC)Receivers measure their loss and RTT The slowest receiver sends back an ACK on each data packet (I.e. ack clock)The sender runs TCP congestion control algorithms using these ACKsResults in faster responsiveness but somewhat high variation in send rateWorks well for small groups, or if only a single receiver at a time is congested

Page 94: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

94

Multi-Rate: Receiver AdaptationReceiver-driven Layered Multicast (RLM)

Layered video encodingEach layer uses its own mcast group

Receiver subscribes to max group that will get through with minimal dropsDynamically adapt to available capacity

Use packet losses as congestion signalOn congestion, receivers drop a layerOn spare capacity, receivers add a layerJoin experiments used for shared learning

Assume no special router supportPackets dropped independently of layer

Page 95: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

95

Layered Media Streams

S R

R1R2

R3

R10Mbps

10Mbps

512Kbps

128Kbps

10Mbps

R3 joins layer 1, fails at layer 2

R2 join layer 1,join layer 2 fails at layer 3

R1 joins layer 1,joins layer 2 joins layer 3

Page 96: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

96

RLM Join Experiment

Receivers periodically try subscribing to higher layerIf enough capacity, no congestion, no drops Keep layer (& try next layer)If not enough capacity, congestion, drops Drop layer (& increase time to next retry)What about impact on other receivers?

Page 97: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

97

Join Experiments

1

2

3

4

Time

Layer

Page 98: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

98

RLM Receiver CoordinationReceiver advertises intent to add layerOther receivers

Avoid conflicting experimentsIf experiment fails, will see increased drops => don’t try adding layer! (shared learning)OK to try adding lower layer during higher layer experiment

Won’t cause drops at higher layer!

Page 99: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

99

IP Multicast: Concerns

Deployment is difficult and slowISP’s reluctant to turn on IP Multicast

Open Group Model: Anyone can join a GroupInter-domain routing: MSDP doesn’t scaleAddress allocation is also complexPIM-SM requires a Rendezvous Point (RP): subject to attackMulticast tried to solve too many problems…

Page 100: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

100

Single-Source Multicast (SSM)

(S,G) (S’,G)

Source-specific channel (S,G)only S can send to Ganother source S’ must use a separate channel (S’,G)hosts join channels, so a member joining only (S,G) will NOT receive traffic from S’

Current infrastructure uses Any-Source Multicast (ASM)

any source can send to any group at any time

Page 101: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

101

Why SSM?Network Operator

trivial address allocation (16 million addresses per host)no network-layer source discovery (PIM RP and/or MSDP moved to the application layer)overcomes two significant obstacles to deployment

Content Providerexclusive access to multicast groups (no interruptions)permanent multicast groups (easy to advertise)provides better service

Page 102: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

102

Problem: How to find the Source?

source

source

Page 103: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

103

How to Find the Sources?broadcast everywhere

receivers decide when they do not want the traffic

any source multicast (ASM) [PIM-SM/MBGP/MSDP/IGMPv2]

use a rendezvous point (RP)receivers send joins along reverse path to RPsources send traffic to RP

source specific multicast (SSM) [PIM/MBGP/IGMPv3]require receivers to already know source(s)use some out-of-band mechanism

Page 104: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

104

How MSDP works with PIM-SM

RP

RP

RP

RP

MSDP peerPhysical link

A

B

C D

Receiver

Source

PIM messageMSDP message

SA

SA

SA

JoinJoinJoin

Join

Join

Page 105: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

105

How SSM Works

Physical link

A

B

C D

Receiver

Source

PIM message

JoinJoinJoin

Join

Join

Join

Page 106: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

106

SSM Advantages (cont’d)No RP, No need for MSDP All joins are (S,G), so no need for Class D address allocationReceivers find out about sources through out-of-band means (such as a web site)SSM-only implementations are much simpler than the full PIM-SM

No RP, No Bootstrap RP Election No Register state machineNo need to keep (*,G), (S,G,rpt) and (*,*,RP) stateNo (*,G) Assert State

Page 107: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

107

Application-level MulticastDo we really need network level multicast?

EfficiencyLogical rendezvous

Can we get efficiency by creatively using the actual members or a few infrastructure nodes?

Page 108: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

108

Application-level Multicast

S

R3 R4

R2

S

R3 R4

R2

Page 109: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

109

Stanford

CMU

Stan1

Stan2

Berk2

Overlay TreeGatech

Berk1

Berkeley

Gatech Stan1

Stan2CMU

Berk1

Berk2

Narada: End System Multicast

Page 110: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

110

Performance Concerns

Duplicate Packets:Bandwidth Wastage

CMU

Stan1

Stan2

Berk2

Gatech

Berk1

Delay from CMU to Berk1 increases

Stanford

Berkeley

Gatech Stan1

Stan2CMU

Berk1

Berk2

Page 111: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

111

Overlay TreeThe delay between the source and receivers is smallIdeally, the number of redundant packets on any physical link is lowHeuristic:

Every member in the tree has a small degree Degree chosen to reflect bandwidth of connection to Internet

Gatech

“Efficient” overlay

CMU

Berk2

Stan1

Stan2

Berk1Berk1

High degree (unicast)Berk2

Gatech

Stan2CMU

Stan1

Stan2

High latency

CMU

Berk2

Gatech

Stan1

Berk1

Page 112: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

112

Mesh

Berk2 Berk1

CMU

Gatech

Stan1Stan2

Advantages:Offers a richer topology robustness; don’t need to worry to much about failuresDon’t need to worry about cycles

Desired properties Members have low degreesShortest path delay between any pair of members along mesh is small

Page 113: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

113

Overlay TreesSource routed minimum spanning tree on meshDesired properties

Members have low degreeSmall delays from source to receivers

Berk2 GatechBerk1

Stan1Stan2

Berk2 Berk1

CMU

Gatech

Stan1Stan2

Page 114: IP: Addresses and Forwarding

Shivkumar KalyanaramanRensselaer Polytechnic Institute

114

Summary

IP multicast issues and applicationsMulticast over LANs and scopingIGMPMulticast Routing and MBONEReliable multicast transportsMulticast Congestion ControlSSM, Overlay Multicast