57
1 Diagnosing: Home Wireless & Wide-area Networks Partha Kanuparthy, Constantine Dovrolis Georgia Institute of Technology 1 Monday, February 13, 2012

Diagnosing: Home Wireless & Wide-area Networks

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Diagnosing: Home Wireless & Wide-area Networks

1

Diagnosing:Home Wireless &

Wide-area NetworksPartha Kanuparthy, Constantine Dovrolis

Georgia Institute of Technology

1Monday, February 13, 2012

Page 2: Diagnosing: Home Wireless & Wide-area Networks

2

Two PartsDiagnosing home wireless networks [CCR’12]

Joint work between GT, Telefonica, CMU

Diagnosing wide-area networks [in-progress]

Joint work with Constantine Dovrolis

and a quick update on ShaperProbe

2Monday, February 13, 2012

Page 3: Diagnosing: Home Wireless & Wide-area Networks

3

Diagnosing Home Wireless

3Monday, February 13, 2012

Page 4: Diagnosing: Home Wireless & Wide-area Networks

4

Home 802.11 NetworksUbiquitous: most residential e2e paths start/end with 802.11 hopUse a shared channel across devices

infrastructure, half-duplex

Co-exist with neighborhood wireless and non-802.11 devices (2.4GHz cordless, Microwave ovens, ...)

4Monday, February 13, 2012

Page 5: Diagnosing: Home Wireless & Wide-area Networks

5

802.11 Performance Problems

5Monday, February 13, 2012

Page 6: Diagnosing: Home Wireless & Wide-area Networks

5

802.11 Performance Problems

Wireless clients see problems:

5Monday, February 13, 2012

Page 7: Diagnosing: Home Wireless & Wide-area Networks

5

802.11 Performance Problems

Wireless clients see problems:Low signal strength (due to distance, fading and multipath)

5Monday, February 13, 2012

Page 8: Diagnosing: Home Wireless & Wide-area Networks

5

802.11 Performance Problems

Wireless clients see problems:Low signal strength (due to distance, fading and multipath)Congestion (due to shared channel)

5Monday, February 13, 2012

Page 9: Diagnosing: Home Wireless & Wide-area Networks

5

802.11 Performance Problems

Wireless clients see problems:Low signal strength (due to distance, fading and multipath)Congestion (due to shared channel)Hidden terminals (no carrier sense)

5Monday, February 13, 2012

Page 10: Diagnosing: Home Wireless & Wide-area Networks

5

802.11 Performance Problems

Wireless clients see problems:Low signal strength (due to distance, fading and multipath)Congestion (due to shared channel)Hidden terminals (no carrier sense)Non-802.11 interference (microwave, cordless, ...)

5Monday, February 13, 2012

Page 11: Diagnosing: Home Wireless & Wide-area Networks

6

WLAN-ProbeWe diagnose 3 performance pathologies:

congestion, low signal strength, hidden terminals

Tool: WLAN-Probesingle 802.11 prober

user-level: works with commodity NICs

no special hardware or administrator requirements

6Monday, February 13, 2012

Page 12: Diagnosing: Home Wireless & Wide-area Networks

6

WLAN-ProbeWe diagnose 3 performance pathologies:

congestion, low signal strength, hidden terminals

Tool: WLAN-Probesingle 802.11 prober

user-level: works with commodity NICs

no special hardware or administrator requirements

6Monday, February 13, 2012

Page 13: Diagnosing: Home Wireless & Wide-area Networks

6

WLAN-ProbeWe diagnose 3 performance pathologies:

congestion, low signal strength, hidden terminals

Tool: WLAN-Probesingle 802.11 prober

user-level: works with commodity NICs

no special hardware or administrator requirements

6Monday, February 13, 2012

Page 14: Diagnosing: Home Wireless & Wide-area Networks

6

WLAN-ProbeWe diagnose 3 performance pathologies:

congestion, low signal strength, hidden terminals

Tool: WLAN-Probesingle 802.11 prober

user-level: works with commodity NICs

no special hardware or administrator requirements

6Monday, February 13, 2012

Page 15: Diagnosing: Home Wireless & Wide-area Networks

7

Life of 802.11 PacketDelays in a busy channel:

channel busy-wait delay

Delays in presence of bit errors:L2 retransmissions

random backoffs

Unavoidable variable delays:TX-delay(s) (based on L2 TX-rate)

802.11 ACK receipt delay

7Monday, February 13, 2012

Page 16: Diagnosing: Home Wireless & Wide-area Networks

7

Life of 802.11 PacketDelays in a busy channel:

channel busy-wait delay

Delays in presence of bit errors:L2 retransmissions

random backoffs

Unavoidable variable delays:TX-delay(s) (based on L2 TX-rate)

802.11 ACK receipt delay

7Monday, February 13, 2012

Page 17: Diagnosing: Home Wireless & Wide-area Networks

7

Life of 802.11 PacketDelays in a busy channel:

channel busy-wait delay

Delays in presence of bit errors:L2 retransmissions

random backoffs

Unavoidable variable delays:TX-delay(s) (based on L2 TX-rate)

802.11 ACK receipt delay

Usually implemented in NIC firmware

Can we measure these delays?

Yes!

7Monday, February 13, 2012

Page 18: Diagnosing: Home Wireless & Wide-area Networks

8

Access Delaybusy-wait re-TXs backoffs TX-delay ACKs

8Monday, February 13, 2012

Page 19: Diagnosing: Home Wireless & Wide-area Networks

8

Access Delaybusy-wait re-TXs backoffs TX-delay ACKs

rate adaptation!

8Monday, February 13, 2012

Page 20: Diagnosing: Home Wireless & Wide-area Networks

8

Access Delay

Captures channel “busy-ness” and channel bit errors

excludes 802.11 rate modulation effects

d = OWD - (TX delay)

busy-wait re-TXs backoffs TX-delay ACKs

rate adaptation!

first L2 transmission

8Monday, February 13, 2012

Page 21: Diagnosing: Home Wireless & Wide-area Networks

8

Access Delay

Captures channel “busy-ness” and channel bit errors

excludes 802.11 rate modulation effects

d = OWD - (TX delay)

busy-wait re-TXs backoffs TX-delay ACKs

rate adaptation!

first L2 transmission

??

8Monday, February 13, 2012

Page 22: Diagnosing: Home Wireless & Wide-area Networks

9

Access Delay: TX delay

d = OWD - (TX delay)TX-rate?

send 50-packet train with few tiny packets

use packet pair dispersion to get TX-rate:

current busy-wait delays

9Monday, February 13, 2012

Page 23: Diagnosing: Home Wireless & Wide-area Networks

10

Access Delay: noise?d = OWD - (TX delay)

10Monday, February 13, 2012

Page 24: Diagnosing: Home Wireless & Wide-area Networks

10

Access Delay: noise?Dispersion underestimates:

due to re-TXs, busy-waits, etc.

d = OWD - (TX delay)

10Monday, February 13, 2012

Page 25: Diagnosing: Home Wireless & Wide-area Networks

10

Access Delay: noise?Dispersion underestimates:

due to re-TXs, busy-waits, etc.

Insight: TX-rate typically remains same at timescales of a single train

d = OWD - (TX delay)

10Monday, February 13, 2012

Page 26: Diagnosing: Home Wireless & Wide-area Networks

10

Access Delay: noise?Dispersion underestimates:

due to re-TXs, busy-waits, etc.

Insight: TX-rate typically remains same at timescales of a single trainFind a single rate for the train!

d = OWD - (TX delay)

10Monday, February 13, 2012

Page 27: Diagnosing: Home Wireless & Wide-area Networks

11

Diagnosis

11Monday, February 13, 2012

Page 28: Diagnosing: Home Wireless & Wide-area Networks

12

Size-dependent Pathologies

Low signal strengthHidden terminals Congestion

Bit errors increase with packet size:Higher percentile access delays show trends.

12Monday, February 13, 2012

Page 29: Diagnosing: Home Wireless & Wide-area Networks

13

Hidden TerminalsHidden terminals respond to frame corruption

by random backoffs

Look at immediate neighbors of large delay or lost (L3) packets

hidden terminal: neighbor delays are small

low SNR: neighbors are similar

13Monday, February 13, 2012

Page 30: Diagnosing: Home Wireless & Wide-area Networks

14

Hidden TerminalsDefine two measures:

pu = P [ high delay or L3 loss ]

pc = P [ neighbor is high delay or L3 loss | high delay or L3 loss ]

Hidden terminal:pc ≈ pu

time

Acce

ss de

lay

t imeAc

cess

dela

y

Hiddenterminal(s)

Low SNR

14Monday, February 13, 2012

Page 31: Diagnosing: Home Wireless & Wide-area Networks

15

Hidden Terminals

Hidden terminal: pc ≈ pu Low SNR:

pc ≫ pu

15Monday, February 13, 2012

Page 32: Diagnosing: Home Wireless & Wide-area Networks

16

SummaryWLAN-Probe: tool for user-level diagnosis of 802.11 pathologies

Single 802.11 probing point

Commodity NICs

No kernel/admin-level changes

Extensions:wide-area probing for 802.11 diagnosis? (“M-Lab”)

passive (TCP) inference?

16Monday, February 13, 2012

Page 33: Diagnosing: Home Wireless & Wide-area Networks

17

Pythia:Detection, Localization, Diagnosis

of Wide-area Performance Problems

17Monday, February 13, 2012

Page 34: Diagnosing: Home Wireless & Wide-area Networks

18

Pythia:one tool, three objectives

Data analysis tool (e.g , perfSONAR data)Funded by DoEDetection: “noticeable loss rate between ORNL and SLAC on 07/11/11 at 09:00:02 EDT”

Localization“it happened at DENV-SLAC link”

Diagnosis“it was due to insufficient router buffers”

18Monday, February 13, 2012

Page 35: Diagnosing: Home Wireless & Wide-area Networks

19

Pythia: ApproachExisting diagnosis systems mine patterns and dependencies in large-scale network data (e.g., AT&T’s G-RCA)Can we use domain knowledge?

useful in inter-domain diagnosis where data is not available

Architecture: sensors do full-mesh measurements of network

central server computes and renders results

Infrastructure: perfSONAR (ESnet & Internet2)

19Monday, February 13, 2012

Page 36: Diagnosing: Home Wireless & Wide-area Networks

20

DetectionFirst step: “Is there a problem?”Look for deviations from baseline

Delay: nonparametric kernel density estimates to locate baselineLoss and reordering: empirical baseline estimates

NY-CLEV ALBU-ATLbaseline

2.5s rise!

20Monday, February 13, 2012

Page 37: Diagnosing: Home Wireless & Wide-area Networks

20

DetectionFirst step: “Is there a problem?”Look for deviations from baseline

Delay: nonparametric kernel density estimates to locate baselineLoss and reordering: empirical baseline estimates

NY-CLEV ALBU-ATLbaseline

2.5s rise!

Estimated eventsEvents / path / day

ESnet12 days, 33 monitors 933 0.1

Internet2 22 days, 9 monitors 2268 1.4

20Monday, February 13, 2012

Page 38: Diagnosing: Home Wireless & Wide-area Networks

21

DiagnosisFollow-up to detection:“What is the root cause?”Diagnosis types:

congestion types

routing effects

loss nature

reordering nature

end-host effects

21Monday, February 13, 2012

Page 39: Diagnosing: Home Wireless & Wide-area Networks

22

Congestion Nature

22Monday, February 13, 2012

Page 40: Diagnosing: Home Wireless & Wide-area Networks

22

Congestion Nature“Overload”: persistent queue build-up

22Monday, February 13, 2012

Page 41: Diagnosing: Home Wireless & Wide-area Networks

22

Congestion Nature“Overload”: persistent queue build-up“Bursty”: intermittent queues (high jitter)

22Monday, February 13, 2012

Page 42: Diagnosing: Home Wireless & Wide-area Networks

22

Congestion Nature“Overload”: persistent queue build-up“Bursty”: intermittent queues (high jitter)Very small buffer

22Monday, February 13, 2012

Page 43: Diagnosing: Home Wireless & Wide-area Networks

22

Congestion Nature“Overload”: persistent queue build-up“Bursty”: intermittent queues (high jitter)Very small bufferExcessive buffer

Overload:ESnet

Bursty:PlanetLab

Bursty:Home link

Excessivebuffer:

Home link

22Monday, February 13, 2012

Page 44: Diagnosing: Home Wireless & Wide-area Networks

23

Loss NatureRandom losses: (majority) losses do not correlate with high delaysOtherwise: non-random losses

23Monday, February 13, 2012

Page 45: Diagnosing: Home Wireless & Wide-area Networks

23

Loss NatureRandom losses: (majority) losses do not correlate with high delaysOtherwise: non-random losses

Random losses: Home link Non-random loss: ESnet

23Monday, February 13, 2012

Page 46: Diagnosing: Home Wireless & Wide-area Networks

24

End-host EffectsDelays and losses induced due to:

context switchesclock synchronization (NTP)others (e.g., PlanetLab virtualization)

24Monday, February 13, 2012

Page 47: Diagnosing: Home Wireless & Wide-area Networks

24

End-host EffectsDelays and losses induced due to:

context switchesclock synchronization (NTP)others (e.g., PlanetLab virtualization)

Internet2:context switch

PlanetLab:end-host

noise

24Monday, February 13, 2012

Page 48: Diagnosing: Home Wireless & Wide-area Networks

25

The Diagnosis TreeInput:

Detected Events(delay, loss, reordering)

25Monday, February 13, 2012

Page 49: Diagnosing: Home Wireless & Wide-area Networks

25

The Diagnosis TreeEnd-hosteffects

NTP vs.route events

Loss events Congestion

Reorderingnature

Input:Detected Events

(delay, loss, reordering)Not shown:

Unknown type

25Monday, February 13, 2012

Page 50: Diagnosing: Home Wireless & Wide-area Networks

26

Pythia In-ProgressMore performance problem types...Unsupervised clustering to identify unknown eventsOpen-source system implementation:

Detection, localization, diagnosis

Interfacing with data:

ESnet, I2, PL-testbed, broadband networks

Front-end for operators

26Monday, February 13, 2012

Page 51: Diagnosing: Home Wireless & Wide-area Networks

27

ShaperProbe: update

27Monday, February 13, 2012

Page 52: Diagnosing: Home Wireless & Wide-area Networks

28

The FCC 2011 reportFCC broadband study (2011) found that: “many cable service tiers exceed 100% of the advertised upstream rate”

We revisit this statement

FCC/SamKnows measured the sustained rate using a 30s TCP stream

If shaping kicks in after 25s, the sustained speed can’t be measured

28Monday, February 13, 2012

Page 53: Diagnosing: Home Wireless & Wide-area Networks

29

How long should we test?Capacity(Mbps)

Shaping rate (Mbps)

Burst duration (s)

Measured/sustained(%)

Comcastupstream

3.5 1 17 100

Comcastupstream

5 2 15, 31 100, 250Comcastupstream 9 5.5 26 163Comcastupstream

14.5 10 19 100

29Monday, February 13, 2012

Page 54: Diagnosing: Home Wireless & Wide-area Networks

29

How long should we test?Capacity(Mbps)

Shaping rate (Mbps)

Burst duration (s)

Measured/sustained(%)

Comcastupstream

3.5 1 17 100

Comcastupstream

5 2 15, 31 100, 250Comcastupstream 9 5.5 26 163Comcastupstream

14.5 10 19 100

Cox: FCC data

Capacity(Mbps)

Shaping rate

(Mbps)

Burst duration

(s)

Capacity/sustained(%)

Coxupstream 1.5 2 50 133

Measured

Advertised29Monday, February 13, 2012

Page 55: Diagnosing: Home Wireless & Wide-area Networks

30

Thank You!Diagnosing:

Home Wireless &Wide-area Networks

Partha Kanuparthy, Constantine DovrolisGeorgia Institute of Technology

30Monday, February 13, 2012

Page 56: Diagnosing: Home Wireless & Wide-area Networks

31

LocalizationFollow-up to detection:“Which link is bad?”Link/path performance levels discrete:e.g., high delay, medium delay, low delayLocalization: minimum number of bad links that can explain bad paths

use greedy heuristic to solve iteratively

31Monday, February 13, 2012

Page 57: Diagnosing: Home Wireless & Wide-area Networks

31

LocalizationFollow-up to detection:“Which link is bad?”Link/path performance levels discrete:e.g., high delay, medium delay, low delayLocalization: minimum number of bad links that can explain bad paths

use greedy heuristic to solve iteratively

path:CHIC to LOSA

path:ATLA to KANS

path:HOUS to LOSA

+

=

Internet2 event: 28th Feb 2011, 00:10:51 GMT

ge-6-2-0.0-rtr.KANS

ge-6-1-0.0-rtr.LOSA

31Monday, February 13, 2012