LISP Locator Reachability Algorithms. Dino Farinacci, Dave Meyer, Darrel Lewis, Vince Fuller, Andrew Partan, Noel Chiappa. IETF Stockholm, LISP Working Group, July 2009.


Page 1: LISP Locator Reachability Algorithms

LISP Locator Reachability Algorithms

Dino Farinacci, Dave Meyer, Darrel Lewis, Vince Fuller, Andrew Partan, Noel Chiappa

IETF Stockholm, LISP Working Group

July 2009

Page 2: LISP Locator Reachability Algorithms

RLOC Reach Algorithms, July 2009, Slide 2

Agenda

• Problem Statement
• Observe Data Path Combinations
• Using TCP Heuristics
  – “TCP-counts”
• Using data-plane echoing
  – “echo-nonces”
• All in Unison?
• Implementation Report

Page 3: LISP Locator Reachability Algorithms

Problem Statement

• ITR A needs to know if RLOC B is reachable
• ITR A needs to know when it can switch over to B’
• ITR A cannot depend on a B-prefix route to determine if RLOC B is reachable

[Diagram: sites S and D; xTRs A and A’ at S, xTRs B and B’ at D; a “?” marks the open reachability question]

Page 4: LISP Locator Reachability Algorithms

Problem Statement

• When ITR B sends to RLOC A, it doesn’t mean that ITR A can reach RLOC B
• All you know is that RLOC B has not crashed; you don’t know whether the forward-path from A -> B is up

[Diagram: sites S and D; xTRs A and A’ at S, xTRs B and B’ at D; a “?” marks the open reachability question]

Page 5: LISP Locator Reachability Algorithms

Problem Statement

• Loc-reach-bits from ITR B to RLOC A just tell A that RLOC B’ is not down
• They do not tell you that the path from ITR A to RLOC B’ is up

[Diagram: sites S and D; xTRs A and A’ at S, xTRs B and B’ at D; labeled with loc-reach-bits value 0x00000003]
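To make the 0x00000003 value concrete, here is a minimal sketch of testing a loc-reach-bits mask. It assumes bit i, counted from the least significant bit, corresponds to the i-th locator of the mapping; the function name is hypothetical. A set bit only says the sender considers that locator up, not that the path to it works.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when the loc-reach-bits mask marks locator
 * `index` as up. Assumes bit 0 (least significant) maps to the first
 * locator in the mapping entry. */
bool loc_reach_bit_set(uint32_t loc_reach_bits, unsigned index)
{
    assert(index < 32);  /* a 32-bit mask covers at most 32 locators */
    return ((loc_reach_bits >> index) & 1u) != 0;
}
```

Under that assumption, 0x00000003 marks the first two locators (B and B’ in the diagram) as up and every other position as down.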

Page 6: LISP Locator Reachability Algorithms

Solution Space

• Have each ITR probe each ETR for every map-cache entry
  – Can be done with control messaging
  – Can be piggybacked with data
• Use deep-packet-inspection heuristics
• Can’t use a database for up/down status
  – Reachability is relative to the source
  – Up/down status only useful when ETR is down
  – Up status tells you that you can test the path
• Send and pray
  – Use ICMP Unreachables to tell you path down status
  – But there is no ICMP mechanism to tell you when the path is back up

Page 7: LISP Locator Reachability Algorithms

Dilemma

• Need to detect quickly when RLOC is down
  – So you can switch over fast
• Need to have recent up status for an RLOC
  – So you can switch to a working path
• Existence of a route to RLOC doesn’t give you up status
  – Must use a keepalive mechanism
  – Should have up status in order to use an RLOC
• “N times M” control messaging doesn’t scale
  – Especially if you want to switch over fast
  – Trade off message overhead against fast convergence
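A back-of-the-envelope sketch of why “N times M” probing is costly: if each of N ITRs probes each of M RLOCs once per interval, the message rate grows with the product, and shortening the interval for faster switchover raises it further. The function name and the numbers below are illustrative assumptions, not measurements.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: keepalive probes per second (rounded down) when each
 * of n_itrs probes each of m_rlocs once every interval_secs seconds.
 * Replies are not counted; they would double the total. */
uint64_t probes_per_second(uint64_t n_itrs, uint64_t m_rlocs,
                           uint64_t interval_secs)
{
    assert(interval_secs > 0);
    return (n_itrs * m_rlocs) / interval_secs;
}
```

For example, 1,000 ITRs probing 1,000 RLOCs every 60 seconds comes to about 16,666 probes per second in aggregate; tightening the interval to 1 second raises that to 1,000,000.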

Page 8: LISP Locator Reachability Algorithms

Possible Data Paths

• Totally symmetric
  – ITR A and B see each other as up due to receipt of data
  – They could use piggyback keepalives to determine forward-path is up
  – But if there is no site-sourced data offered to ITRs they have to take a leap of faith the forward-path is up

[Diagram: sites S and D; xTRs A and A’ at S, xTRs B and B’ at D]

Page 9: LISP Locator Reachability Algorithms

Possible Data Paths

• Source symmetric
• Return path symmetric

Same case - ITR sends to one RLOC but may only receive from other RLOC

[Two diagrams, one per case: sites S and D; xTRs A and A’ at S, xTRs B and B’ at D]

Page 10: LISP Locator Reachability Algorithms

Possible Data Paths

• Totally asymmetric - “the square”
  – Each xTR only has send *or* receive information
  – xTRs at a site don’t synchronize state
  – Piggyback keepalives can’t work
  – ITR A could request an echo in data-plane but B must reply with a control message
  – ITR A would have to send a control message keepalive to B’ and B’ would reply with a control message

[Diagram: sites S and D; xTRs A and A’ at S, xTRs B and B’ at D]

Page 11: LISP Locator Reachability Algorithms

DPI Mechanism - “TCP-counts”

• Use a TCP connection setup heuristic
  – Specifically designed for “the square”
  – ITRs count SYNs-sent and ACKs-sent for all connections
  – If ACKs are being sent, the return path from B’ to A’ is working *and* therefore the path from A to B is working
  – If SYNs are sent but no ACKs, then there is no return traffic
  – But A -> B could be working when B -> D, D -> B’, B’ -> A’, or A’ -> S is broken; in this case A should not switch over to B’
  – This mechanism gives you path up status but not good down status

[Diagram: sites S and D; xTRs A and A’ at S, xTRs B and B’ at D; the TCP handshake (SYN, SYN/ACK, ACK) is traced around the square]
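The counting step can be sketched as a hook the ITR runs on each TCP segment it encapsulates. The struct and function names are hypothetical, and how a SYN/ACK is classified is an assumption: here any segment with ACK set counts as an “ACK sent”, since the local host only acknowledges after something arrived from the remote site.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCP_FLAG_SYN 0x02u  /* standard TCP header flag bits */
#define TCP_FLAG_ACK 0x10u

/* Hypothetical per-RLOC counters kept with the map-cache entry. */
struct tcp_counts {
    uint32_t syns_sent;
    uint32_t acks_sent;
};

/* Called for each TCP segment encapsulated toward this RLOC.
 * Assumption: a segment with ACK set (including SYN/ACK) counts as an
 * ACK sent; a bare SYN counts as a SYN sent; anything else is ignored. */
void tcp_counts_on_encap(struct tcp_counts *c, uint8_t tcp_flags)
{
    assert(c != NULL);
    if (tcp_flags & TCP_FLAG_ACK)
        c->acks_sent++;
    else if (tcp_flags & TCP_FLAG_SYN)
        c->syns_sent++;
}
```

Because the ITR only inspects packets it is already encapsulating, no message is added on the wire and the remote site needs no changes.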

Page 12: LISP Locator Reachability Algorithms

Piggybacking - Echo Nonce

• Nonce in data packet
  – ITR requests ETR to echo nonce back
  – Request sets E-bit in data packet
  – Echo from ETR contains ITR’s nonce with E-bit clear
  – Tests if forward-path is up
  – Only works when traffic is symmetric (bidirectional) between RLOC pairs
  – Detect down status via timeout of echo-nonce
  – Can give quicker convergence than a control message keepalive as long as data is sent from ITR to ETR

[Diagram: sites S and D; xTRs A and A’ at S, xTRs B and B’ at D; request labeled “E=1, nonce: 0x00123456”, echo labeled “E=0, nonce: 0x00123456”]
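The request/echo exchange can be sketched with a small in-memory view of the two fields involved. The struct is hypothetical and does not reproduce the on-wire LISP header layout; masking the nonce to 24 bits is an assumption based on the width of the values shown on the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the two header fields used by echo-nonce. */
struct lisp_nonce_fields {
    bool     e_bit;  /* set: "please echo this nonce back" */
    uint32_t nonce;  /* assumed 24 bits wide */
};

/* ITR side: build a request asking the ETR to echo `nonce`. */
struct lisp_nonce_fields echo_request(uint32_t nonce)
{
    struct lisp_nonce_fields f = { true, nonce & 0x00FFFFFFu };
    return f;
}

/* ETR side: answer a request with the same nonce and the E-bit clear. */
struct lisp_nonce_fields echo_reply(const struct lisp_nonce_fields *req)
{
    assert(req != NULL && req->e_bit);
    struct lisp_nonce_fields f = { false, req->nonce };
    return f;
}

/* ITR side: the forward path is confirmed when the echoed nonce matches. */
bool echo_matches(uint32_t sent_nonce, const struct lisp_nonce_fields *rx)
{
    assert(rx != NULL);
    return !rx->e_bit && rx->nonce == (sent_nonce & 0x00FFFFFFu);
}
```

With the slide's example, a request carrying E=1 and nonce 0x00123456 is answered by a data packet carrying E=0 and the same nonce, which the ITR accepts as proof that its packet reached the ETR.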

Page 13: LISP Locator Reachability Algorithms

Piggybacking - Echo Nonce

• LISP Header Format
• Documented in draft-ietf-lisp-03.txt

Page 14: LISP Locator Reachability Algorithms

Implementation - “TCP-counts”

• Cost in memory
  – 2 integers per RLOC per map-cache entry
• Data-plane counts syns-sent and acks-sent during ITR encapsulation
  – Unilateral algorithm
• Every minute control-plane looks at counts
  – (!loc->syns_sent && !loc->acks_sent) -> RLOC is idle, leave in up state
  – (!loc->syns_sent && loc->acks_sent) -> RLOC is up, leave in up state
  – (loc->syns_sent && loc->acks_sent) -> RLOC is up, leave in up state
  – (loc->syns_sent && !loc->acks_sent) -> RLOC went down, take down
• If down, bring up if packet received
  – Not square data path anymore
• If down, after 3 minutes, bring back up to start counting
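The four cases of the once-a-minute check collapse into a single condition: only “SYNs sent, no ACKs sent” takes the RLOC down. Below is a minimal sketch with hypothetical type and function names; resetting the counters after each check is an assumption, since the slides do not say whether counts are cumulative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum rloc_verdict { RLOC_KEEP_UP, RLOC_TAKE_DOWN };

/* Hypothetical per-RLOC counters (the "2 integers" of memory cost). */
struct tcp_counts {
    uint32_t syns_sent;
    uint32_t acks_sent;
};

/* Periodic control-plane check over the data-plane counters. */
enum rloc_verdict tcp_counts_evaluate(struct tcp_counts *c)
{
    assert(c != NULL);
    /* Idle, ACKs only, and SYNs plus ACKs all leave the RLOC up. */
    enum rloc_verdict v = (c->syns_sent && !c->acks_sent)
                              ? RLOC_TAKE_DOWN
                              : RLOC_KEEP_UP;
    /* Assumption: start the next interval from zero. */
    c->syns_sent = 0;
    c->acks_sent = 0;
    return v;
}
```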

Page 15: LISP Locator Reachability Algorithms

Implementation - Echo Nonce

• Cost in memory
  – 6 integers per RLOC per map-cache entry
  – ITR to ETR direction (ETR is echoing)
    (1) Next echo-nonce request to send
    (2) Last remote echo-nonce received
    (3) Packets-received count while in echo-nonce request state
    (4) Timestamp when first entering echo-nonce request state
  – ETR to ITR direction (ITR is echoing)
    (5) Last remote echo-nonce request received
    (6) Next echo-nonce to send
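The six integers map naturally onto a per-RLOC struct. This is a sketch only: the field names are hypothetical, and 32-bit fields with a seconds-granularity timestamp are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-RLOC, per-map-cache-entry echo-nonce state. */
struct echo_nonce_state {
    /* ITR to ETR direction (the ETR is echoing) */
    uint32_t next_request_nonce;       /* (1) next echo-nonce request to send */
    uint32_t last_echoed_nonce;        /* (2) last remote echo-nonce received */
    uint32_t pkts_in_request_state;    /* (3) packets received while requesting */
    uint32_t request_state_entered;    /* (4) timestamp, assumed in seconds */
    /* ETR to ITR direction (the ITR is echoing) */
    uint32_t last_request_nonce_rcvd;  /* (5) last remote request received */
    uint32_t next_echo_nonce;          /* (6) next echo-nonce to send */
};

/* Zero every field, which matches a locator that has seen no traffic. */
void echo_nonce_state_init(struct echo_nonce_state *s)
{
    assert(s != NULL);
    memset(s, 0, sizeof *s);
}
```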

Page 16: LISP Locator Reachability Algorithms

Implementation - Echo Nonce

• Handle collision
  – Both sides could be in echo-nonce request state at the same time
  – They would never echo each other
• 2 mechanisms to avoid collision:
  – Use RLOC addressing as tiebreaker
    • Force higher RLOC address to be in echo-nonce request state
    • After lower RLOC address echoes, it can enter echo-nonce request state
  – Use nanosecond clock
    • Odd you keep requesting
    • Even you start echoing
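Both tiebreakers reduce to a one-line predicate. In this sketch the function names are hypothetical, RLOCs are assumed to be IPv4 addresses held as host-order 32-bit integers, and the clock reading is passed in so the logic stays deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Tiebreaker 1: the side with the numerically higher RLOC address stays
 * in echo-nonce request state; the lower side echoes first. */
bool requests_first_by_address(uint32_t local_rloc, uint32_t remote_rloc)
{
    assert(local_rloc != remote_rloc);  /* two ends never share an RLOC */
    return local_rloc > remote_rloc;
}

/* Tiebreaker 2: sample a nanosecond clock; an odd reading means keep
 * requesting, an even reading means start echoing. */
bool keep_requesting_by_clock(uint64_t clock_nanoseconds)
{
    return (clock_nanoseconds & 1u) != 0;
}
```

The address rule is deterministic, while the clock rule breaks the tie only probabilistically: both sides can draw the same parity and need to sample again.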

Page 17: LISP Locator Reachability Algorithms

Implementation - Echo Nonce

• Every minute enter echo-nonce request state if data received in last minute
  – Exit when nonce is echoed or one minute has elapsed; in the former case the RLOC stays up, in the latter it is taken down
  – If no packets received, keep up
  – Wait for arrival of 10 packets (within the minute interval) before checking if nonce was echoed
• If nonce sent in request does not match any echoed nonces, take RLOC down
• When down for 3 minutes, bring up and enter echo-nonce request state
• If down and receive packet, bring up and enter echo-nonce request state
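One reading of the request-state exit rules, as a sketch. The names are hypothetical, and how the rules combine is an interpretation: a matching echo exits up, silence from the RLOC never takes it down, and an unanswered request takes it down once 10 packets have arrived or the minute has run out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REQUEST_STATE_SECS    60u  /* "one minute" */
#define MIN_PKTS_BEFORE_CHECK 10u  /* "arrival of 10 packets" */

enum echo_result { ECHO_WAIT, ECHO_EXIT_UP, ECHO_EXIT_DOWN };

/* Hypothetical subset of the per-RLOC state used while requesting. */
struct echo_request_state {
    uint32_t request_nonce;      /* nonce sent with the E-bit set */
    uint32_t last_echoed_nonce;  /* last nonce echoed by the RLOC */
    uint32_t pkts_received;      /* packets from the RLOC while requesting */
    uint32_t entered_at;         /* seconds */
};

enum echo_result echo_request_check(const struct echo_request_state *s,
                                    uint32_t now)
{
    assert(s != NULL);
    bool timed_out = (now - s->entered_at) >= REQUEST_STATE_SECS;

    if (s->pkts_received > 0 && s->last_echoed_nonce == s->request_nonce)
        return ECHO_EXIT_UP;                       /* nonce was echoed */
    if (s->pkts_received == 0)
        return timed_out ? ECHO_EXIT_UP : ECHO_WAIT;  /* no packets: keep up */
    if (s->pkts_received >= MIN_PKTS_BEFORE_CHECK || timed_out)
        return ECHO_EXIT_DOWN;                     /* traffic but no echo */
    return ECHO_WAIT;                              /* too few packets yet */
}
```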

Page 18: LISP Locator Reachability Algorithms

Implementation - Echo Nonce

• When receiving an echo-nonce request
  – Within 15 seconds return echo for nonce
  – Continue returning echo up to 1 minute
  – Accept request regardless of whether RLOC is up or down
• Some good news
  – Implementation doesn’t do source RLOC lookup in data-plane
  – Part of statistics processing allows echo-nonce state to be conveyed from data-plane to control-plane
• Some bad news
  – Easy to explain at protocol level
  – Hard to implement due to a lot of control-plane/data-plane interaction

Page 19: LISP Locator Reachability Algorithms

All in Unison?

• Echo-nonce doesn’t make you depend on TCP
• TCP-counts can help echo-nonce with the asymmetric data paths
• When one says down and the other says up, keep RLOC up
• When both say down, take RLOC down even when loc-reach-bits say up - because path is down
• When loc-reach-bits say down, take RLOC down
• Conclusion
  – Loc-reach-bits tell you when hard failure is close to RLOC
  – Echo-nonce and tcp-counts tell you about path failure
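The combination rules reduce to a small truth function. This is a sketch with hypothetical names, treating each mechanism's current verdict as a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the three reachability signals for one RLOC. */
struct reach_signals {
    bool loc_reach_bit_up;  /* hard failure close to the RLOC */
    bool echo_nonce_up;     /* path verdict from echo-nonce */
    bool tcp_counts_up;     /* path verdict from TCP-counts */
};

/* Loc-reach-bits saying down always wins. Otherwise the RLOC goes down
 * only when both path algorithms agree that it is down. */
bool rloc_should_be_up(const struct reach_signals *s)
{
    assert(s != NULL);
    if (!s->loc_reach_bit_up)
        return false;
    return s->echo_nonce_up || s->tcp_counts_up;
}
```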

Page 20: LISP Locator Reachability Algorithms

Unidirectional Data?

• When unidirectional data occurs among sites (to one or more ETRs)
  – ITRs can’t tell if forward-path is up
  – Echo noncing won’t work
  – TCP-counts won’t work
• ITR must send a Map-Request to the RLOC
  – Used as a control message keepalive
  – Don’t spec this yet - maintain resistance
• This will be the default scenario for PTRs
  – Since they only encapsulate data packets
• When all priority 1 RLOCs are down and priority 2 RLOCs are in use
  – Can send Map-Request to priority 1 RLOCs to test path before using them
  – Make-before-Break at the expense of control message overhead
  – Don’t spec this yet - maintain resistance

Page 21: LISP Locator Reachability Algorithms

Summary

• The urge to get this 100% right will cause scalability problems
  – Resist the urge
• Will the imperfection be okay?
  – We are solving most of the problem
• With active-active multi-homing life is good
  – We’ll have symmetric paths

[Diagram: sites S and D; xTRs A and A’ at S, xTRs B and B’ at D]

Page 22: LISP Locator Reachability Algorithms

How to Configure

• Defaults to neither enabled
    lisp loc-reach-algorithm {echo-nonce | count-tcp}
• Debug command
    debug lisp loc-reach-algorithm
• Show command
    show {ip | ipv6} lisp map-cache <eid-prefix>
• Supported in release dino-lisp-126

Page 23: LISP Locator Reachability Algorithms

Sample Output

• TCP-counts SYN and ACK counters

    240.23.0.0/16, uptime: 00:06:09, expires: 0.000000, via static
      State: complete, last modified: 00:06:09, map-source: local
      Locator     Uptime    State  Priority/Weight  Packets In/Out
      1.22.23.23  00:06:09  up     1/50             16/17
        Last up/down state change: 00:06:09
        Last data packet in/out: 00:01:08/00:01:08
        Last control packet in/out: never/never
        Last priority/weight change: never/never
        TCP-counts loc-reach algorithm:
          SYNs sent: 9, ACKs sent: 8
      1.23.22.23  00:06:09  up     1/50             0/0
        Last up/down state change: 00:06:09
        Last data packet in/out: never/never
        Last control packet in/out: never/never
        Last priority/weight change: never/never
        TCP-counts loc-reach algorithm:
          SYNs sent: 2, ACKs sent: 0

Callouts: 1.22.23.23 (SYNs sent: 9, ACKs sent: 8) is the RLOC staying up; 1.23.22.23 (SYNs sent: 2, ACKs sent: 0) is the RLOC going down.

Page 24: LISP Locator Reachability Algorithms

Sample Output

• When in echo-nonce request state

    240.23.0.0/16, uptime: 00:44:49, expires: 0.000000, via static
      State: complete, last modified: 00:44:49, map-source: local
      Locator     Uptime    State  Priority/Weight  Packets In/Out
      1.22.23.23  00:44:49  up     1/50             0/0
        Last state change: 00:44:49
        Last data packet in/out: never/never
        Last control packet in/out: never/never
        Last priority/weight change: never/never
        Echo-nonce loc-reach algorithm:
          Next request nonce to RLOC: 0x0049d55c
          Last echoed nonce from RLOC: 0x00000000
          Packets from RLOC in echo-nonce request state: 0
          Last request nonce from RLOC: 0x00000000
          Next echo nonce to RLOC: 0x00000000
      1.23.22.23  00:44:49  up     2/50             0/0
        Last state change: 00:44:49
        Last data packet in/out: never/never
        Last control packet in/out: never/never
        Last priority/weight change: never/never
        Echo-nonce loc-reach algorithm:
          Next request nonce to RLOC: 0x0049fb90
          Last echoed nonce from RLOC: 0x00000000
          Packets from RLOC in echo-nonce request state: 0
          Last request nonce from RLOC: 0x00000000
          Next echo nonce to RLOC: 0x00000000

Page 25: LISP Locator Reachability Algorithms

Sample Output

• When echoing a nonce

    240.23.0.0/16, uptime: 00:56:05, expires: 0.000000, via static
      State: complete, last modified: 00:56:05, map-source: local
      Locator     Uptime    State  Priority/Weight  Packets In/Out
      1.22.23.23  00:56:05  up     1/50             1/2
        Last state change: 00:56:05
        Last data packet in/out: 00:00:03/00:00:03
        Last control packet in/out: never/never
        Last priority/weight change: never/never
        Echo-nonce loc-reach algorithm:
          Next request nonce to RLOC: 0x0049d55c
          Last echoed nonce from RLOC: 0x00000000
          Packets from RLOC in echo-nonce request state: 1
          Last request nonce from RLOC: 0x00f7aa95
          Next echo nonce to RLOC: 0x00f7aa95
      1.23.22.23  00:56:05  up     2/50             0/0
        Last state change: 00:56:05
        Last data packet in/out: never/never
        Last control packet in/out: never/never
        Last priority/weight change: never/never
        Echo-nonce loc-reach algorithm:
          Next request nonce to RLOC: 0x0049fb90
          Last echoed nonce from RLOC: 0x00000000
          Packets from RLOC in echo-nonce request state: 0
          Last request nonce from RLOC: 0x00000000
          Next echo nonce to RLOC: 0x00000000

Page 26: LISP Locator Reachability Algorithms

Big Question

• Can control message probing scale?

• Stay tuned

Page 27: LISP Locator Reachability Algorithms
