IPv6 Neighbor Discovery
An IXP Perspective
Russell Heilling
Senior Network [email protected]
@xchewtoyx
We all understand ARP, right?
• Messages carried directly on Ethernet
  EtherType 0x0806
• Device sends broadcast request
  Who has x.x.x.x?
• Receivers check target against local addresses
• If it matches they send a unicast reply
• Result is cached
All nodes on the network need to process all ARP requests.
High levels of ARP and you are going to have a bad day.
IPv6 Neighbor Discovery
• Defined in http://tools.ietf.org/html/rfc4861
• Messages are carried within ICMPv6
• Includes:
  • Router and prefix discovery
  • Address resolution and neighbor unreachability detection
  • Redirect function
• Address resolution is most relevant from IXP perspective
Router and prefix discovery
• The main point on router discovery: “Don’t do it on the exchange”
• We have seen an increase in the number of members sending RAs
• Please check your config and make sure you have it disabled
• We are improving our instrumentation and will be getting more proactive
• This is an MoU violation, and will result in a chase
Neighbor Solicitation
• Analogous to ARP query message
  “I know your IP, what’s your MAC?”
• ICMPv6 Type 135, Code 0.
• Can be sent unicast to refresh neighbor cache
• Can be multicast to discover uncached neighbors
• Uses last 24 bits of target address to construct multicast destination
  Target: 2001:7f8:4::1553:2
  Destination: ff02::1:ff53:2
  Group MAC: 33:33:ff:53:00:02
• RFC recommends no more than 1 solicitation per second per target
• Unicast solicitation used to refresh stale entry before removing
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Reserved                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                                                               +
|                                                               |
+                       Target Address                          +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Options ...
+-+-+-+-+-+-+-+-+-+-+-+-
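The mapping from target address to multicast destination and group MAC can be sketched in a few lines of Python. This is a minimal illustration using only the standard `ipaddress` module; the function names `solicited_node_group` and `multicast_mac` are made up for this example and are not part of any library.

```python
import ipaddress

def solicited_node_group(target: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast address for a target address."""
    # Keep only the low-order 24 bits of the target address.
    low24 = int(ipaddress.IPv6Address(target)) & 0xFFFFFF
    # Append them to the well-known prefix ff02::1:ff00:0/104.
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)

def multicast_mac(group: ipaddress.IPv6Address) -> str:
    """Return the Ethernet group MAC for an IPv6 multicast address."""
    # 33:33 followed by the low-order 32 bits of the IPv6 group address.
    low32 = int(group) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{o:02x}" for o in octets)

group = solicited_node_group("2001:7f8:4::1553:2")
print(group)                 # ff02::1:ff53:2
print(multicast_mac(group))  # 33:33:ff:53:00:02
```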
Neighbor Advertisement
• Analogous to ARP reply message
• ICMPv6 Type 136, Code 0.
• R, S & O flags to indicate advertisement type
  R & O flags outside scope here
• Can be sent unsolicited [S=0] (like gratuitous ARP)
  In which case uses all-nodes multicast address
• IP source can be any address on same interface as target
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |     Code      |          Checksum             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R|S|O|                     Reserved                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                                                               +
|                                                               |
+                       Target Address                          +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Options ...
+-+-+-+-+-+-+-+-+-+-+-+-
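As a rough illustration of the layout above, the fixed part of a Neighbor Advertisement can be decoded with Python's `struct` module. This is only a sketch: `parse_neighbor_advertisement` is a hypothetical helper, it expects the ICMPv6 payload only (no IPv6 header), it does not verify the checksum, and it ignores any options.

```python
import ipaddress
import struct

def parse_neighbor_advertisement(payload: bytes) -> dict:
    """Decode the fixed 24-byte part of an ICMPv6 Neighbor Advertisement."""
    if len(payload) < 24:
        raise ValueError("too short for a Neighbor Advertisement")
    # Type (8 bits), Code (8 bits), Checksum (16 bits), flags + Reserved (32 bits).
    icmp_type, code, _checksum, flags = struct.unpack("!BBHI", payload[:8])
    if icmp_type != 136 or code != 0:
        raise ValueError("not a Neighbor Advertisement")
    return {
        "router": bool(flags & 0x80000000),     # R: most significant bit
        "solicited": bool(flags & 0x40000000),  # S
        "override": bool(flags & 0x20000000),   # O
        "target": ipaddress.IPv6Address(payload[8:24]),
    }

# Hand-built sample with S and O set; the checksum is left as zero (unverified).
sample = struct.pack("!BBHI", 136, 0, 0, 0x60000000)
sample += ipaddress.IPv6Address("2001:7f8:4::1553:2").packed
info = parse_neighbor_advertisement(sample)
print(info)
```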
Unknown unicast
• VPLS is just a virtual switch – still needs to learn MAC addresses
• Ports going down immediately flush database entries, causing short bursts of flooding while the MAC is relearnt
• Unidirectional flows can result in longer-term flooding if the destination ages out of the database
• Stale routes can direct traffic to unknown MACs, leading to extended flooding
• ARP can flush FDB entries on XOS (bug)
• We are investigating ways to better mitigate.
So why use multicast if it goes everywhere?
• A well designed NIC will filter in hardware
• ARP queries go to a single (broadcast) destination and will always need to be punted up the stack
• Neighbor solicitations are distributed over a large number of multicast groups. Most of them can be filtered out in hardware
More on NIC Filtering
• Ideally a NIC would have enough filter space for all subscribed groups
• Reality is that space is limited
• Different cards take different approaches
  • Fallback to promiscuous mode
  • Promiscuous for all multicast
  • Hash the group address, accept any groups that hash to same value
• Caveat emptor. Know your hardware limits.
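The hash approach can be illustrated with a toy model. Everything here is an assumption for the sake of the example: the 64-bucket table size, the use of CRC-32 as the hash, and the `HashFilter` class itself. Real NICs differ in hash function and table size, so check the datasheet for your hardware.

```python
import zlib

BUCKETS = 64  # hypothetical table size; real hardware varies

def bucket(mac: str) -> int:
    """Map a MAC address to a hash bucket (CRC-32 modulo table size)."""
    raw = bytes(int(part, 16) for part in mac.split(":"))
    return zlib.crc32(raw) % BUCKETS

class HashFilter:
    """Imperfect multicast filter: one bit per hash bucket."""

    def __init__(self) -> None:
        self.bits = 0

    def subscribe(self, mac: str) -> None:
        self.bits |= 1 << bucket(mac)

    def accepts(self, mac: str) -> bool:
        # Any group that hashes to a set bucket is passed up the stack,
        # whether or not the host actually subscribed to it.
        return bool((self.bits >> bucket(mac)) & 1)

subscribed = "33:33:ff:53:00:02"
nic = HashFilter()
nic.subscribe(subscribed)

# Which of the 256 possible 33:33:ff:{A}:00:01 groups leak through?
passed = [a for a in range(256) if nic.accepts(f"33:33:ff:{a:02x}:00:01")]
print(len(passed), "of 256 other groups share the subscribed bucket")
```

The filter never drops a subscribed group, but it does accept every unsubscribed group that lands in the same bucket, which the host then has to discard in software.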
[linx-ops] LINX London Juniper LAN weirdness
• Nov 19th 2014 22:28 – Massive increase in non-unicast traffic
• Investigation shows member with fibre issue
• 2x10GE LAG, one link bouncing
• Member router not happy, sending massive numbers of neighbor solicitations
• Maxed out at around 3 kpps
• Caused instability for a number of other members
[linx-ops] LINX London Juniper LAN weirdness
• “IXPWatch” is good at spotting this for ARP
• Turns out not so good for IPv6 NS
• IPv6 NS stats were easily added to the report
• Detection and alerting still have room for improvement
A note on addressing on LINX peering LANs
• LINX recommended IPv6 address:
  2001:7f8:4:{LAN}::{ASN}:1/64
• LAN administered by LINX
• ASN converted to hex, not BCD
• Examples:
  LINX (5459) on Juniper LAN
  2001:7f8:4::1553:1
  LINX (8714) on IXCardiff
  2001:7f8:4:4::220a:1
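The addressing scheme can be reproduced with a short Python sketch. `linx_address` is a hypothetical helper written for this example. It assumes the LAN number and ASN each fit in one 16-bit group, so it does not cover 32-bit ASNs, and the LAN values (0 for the Juniper LAN, 4 for IXCardiff) are read off the two examples above.

```python
import ipaddress

# 2001:7f8:4::/48 as an integer, taken from the examples above.
LINX_PREFIX = int(ipaddress.IPv6Address("2001:7f8:4::"))

def linx_address(lan: int, asn: int, host: int = 1) -> str:
    """Build 2001:7f8:4:{LAN}::{ASN}:{host} with the ASN written in hex."""
    for name, value in (("lan", lan), ("asn", asn), ("host", host)):
        if not 0 <= value <= 0xFFFF:
            raise ValueError(f"{name} must fit in 16 bits in this sketch")
    # LAN is the 4th 16-bit group, ASN the 7th, host the 8th.
    value = LINX_PREFIX | (lan << 64) | (asn << 16) | host
    return str(ipaddress.IPv6Address(value))

print(linx_address(0, 5459))  # 2001:7f8:4::1553:1
print(linx_address(4, 8714))  # 2001:7f8:4:4::220a:1
```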
So how does that work with Neighbor Solicitations?
• LINX recommended IPv6 address
  2001:7f8:4:{LAN}::{ASN}:1/64
• Solicited-node multicast MAC address
  33:33:ff:{A}:00:01
• A is the low order octet of the ASN
• 5th byte is almost always zero
• 550+ unique member ASNs share 229 last octets
• Most group addresses match at least 2 members
• Some as high as 7
• Still much better than ARP
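The sharing of group addresses follows directly from keeping only the low-order octet of the ASN. A minimal sketch of the counting: the ASN list below is not member data. It mixes the two LINX ASNs from the examples above with private-use ASNs chosen to collide, and `groups_by_low_octet` is a hypothetical helper.

```python
from collections import defaultdict

def groups_by_low_octet(asns):
    """Group ASNs by the octet {A} that ends up in 33:33:ff:{A}:00:01."""
    groups = defaultdict(list)
    for asn in asns:
        groups[asn & 0xFF].append(asn)
    return dict(groups)

# Illustrative only: 5459 and 8714 come from the slides, the rest are
# private-use ASNs picked so that they share a low-order octet.
sample_asns = [5459, 8714, 64595, 64851, 64522]

groups = groups_by_low_octet(sample_asns)
for octet, members in sorted(groups.items()):
    print(f"33:33:ff:{octet:02x}:00:01 -> {members}")
```

Running the same grouping over a real member list would show how many distinct group MACs are in use and how many members listen on each one.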
How busy is IPv6?
• Around 0.7% of traffic on Juniper LAN
• Follows very similar diurnal pattern to IPv4
• Not just BGP and monitoring – real traffic
How does ARP vs NS look?
wat?
There are more neighbor solicitations than ARP requests on the Juniper LAN
How do the distributions compare?
• Median interval between repeated ARP requests is 8s
• Median for NS is only 4s
• ARP intervals more distributed
• NS has strong peaks at 1s, 3-5s
• Smaller peak at approx 60s
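One way to derive interval figures like these from a packet capture is sketched below. It assumes a "repeated request" means the same sender asking for the same target again; that is a guess at the method, not something the slides state. The sample events are invented purely to exercise the code.

```python
from statistics import median

def repeat_intervals(events):
    """Return the gaps between successive requests for each (sender, target).

    events: iterable of (timestamp_seconds, sender, target) tuples.
    """
    last_seen = {}
    intervals = []
    for ts, sender, target in sorted(events):
        key = (sender, target)
        if key in last_seen:
            intervals.append(ts - last_seen[key])
        last_seen[key] = ts
    return intervals

# Invented sample: "a" retries every second, "b" retries after 4 s and 60 s.
sample = [
    (0.0, "a", "x"), (1.0, "a", "x"), (2.0, "a", "x"),
    (0.5, "b", "y"), (4.5, "b", "y"), (64.5, "b", "y"),
]

intervals = repeat_intervals(sample)
print(intervals)          # [1.0, 1.0, 4.0, 60.0]
print(median(intervals))  # 2.5
```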
ND may attempt to be more efficient than ARP, but it sure seems chatty
What is causing the difference?
• Repeat offenders? Maybe…
  Top 5% of senders account for 34% of requests*
• Down neighbors?
  Strong peak at 1s suggests retries
  About 80% of destinations down
• I think we have a winner…
* Based on analysis of peak hour flooded traffic
Could we / Should we do something?
• Obvious reaction might be to suggest higher RETRANS_TIMER value
• Before jumping to that conclusion we should ask “Does it matter that there is more ND than ARP?”
• NS addressing makes it easier for nodes to cope
• Extending timer also makes unreachability detection slower
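The trade-off in the last point can be put into numbers. The sketch below uses the RFC 4861 default constants (MAX_UNICAST_SOLICIT = 3, RETRANS_TIMER = 1 second) and only models the probe phase of unreachability detection, not the reachable or delay timers that run before it. `probe_phase_seconds` is a hypothetical helper for illustration.

```python
# RFC 4861 protocol constants (defaults).
MAX_UNICAST_SOLICIT = 3
DEFAULT_RETRANS_TIMER = 1.0  # seconds

def probe_phase_seconds(retrans_timer: float,
                        probes: int = MAX_UNICAST_SOLICIT) -> float:
    """Time spent probing before a neighbor is declared unreachable.

    Each probe is followed by a wait of retrans_timer seconds, including
    the wait after the final probe.
    """
    return probes * retrans_timer

for timer in (DEFAULT_RETRANS_TIMER, 4.0):
    print(f"RETRANS_TIMER={timer}s -> probe phase {probe_phase_seconds(timer)}s")
```

With the default timer the probe phase lasts 3 seconds. Raising the timer to 4 seconds would cut retry traffic towards down neighbors to a quarter, but stretches the same phase to 12 seconds.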