System Architecture Lab: An Analysis of Fault Isolation in Multi-Source Multicast Session. Network Research Workshop, 2003. 8. 28. Heonkyu Park (hkpark@cosmos.kaist.ac.kr), Korea Advanced Institute of Science and Technology


System Architecture Lab

An Analysis of Fault Isolation in Multi-Source Multicast Session

Network Research Workshop

2003. 8. 28

Heonkyu Park (hkpark@cosmos.kaist.ac.kr)

Korea Advanced Institute of Science and Technology


Table of Contents

1. Motivations / Problem Definition

2. Background

3. Analysis

4. Issues

5. Candidate Model

6. Simulation Results

7. Conclusion

References


Before we start…

• Terminology
  – Unicast: delivery to a single receiver
  – Multicast: delivery to a specific subset of receivers
    • single-source: only one source in a session (one-to-many multicast)
    • multi-source: many sources in a session (many-to-many multicast)
  – Fault detection: perceiving that a fault exists somewhere in the network ("Hmm… there is a fault somewhere…")
  – Fault isolation: locating the on-tree router or link that is the origin of the fault ("OK! I found the fault!")


Motivation / Problem Definition

1. Network monitoring is necessary to detect and discover network problems.

2. Some participants in a multicast session experience severe packet loss.

3. Fault detection / isolation approaches for multicast have focused on single-source networks.

[Figure: obtained using the Rqm tool [Rqm]]

4. In multi-source multicast, little work has been done on fault isolation.

5. Straightforward reuse of a single-source solution is not sufficient for multicast sessions with a large number of sources.

A new model for fault isolation in multi-source multicast is needed.


Background

1. IP Multicast
2. Multi-Source Multicast Applications
3. Challenges of Multicast Monitoring
4. Needs for Multicast Fault Isolation

1. IP Multicast

[Figure: a source sends multicast packets to a multicast session with four receivers]

When a fault occurs, the routing path changes.


Multi-Source Multicast Applications

1. Networked virtual environments
2. Synchronized resources such as database updates
3. Distributed or parallel concurrent processing
4. Large-scale distributed military simulation
5. Peer-to-peer multicast file transfer model
6. Large-scale multimedia conference
7. Large-scale replicated database
8. Cooperative web cache protocols
9. Shared editing and collaboration
10. Interactive distance learning
11. Network games or chatting
12. and more…

[Figure: Group Size [LN01]. Application classes (streaming, content distribution, collaboration tools, games, distributed information systems, peer-to-peer applications) placed by number of senders versus number of receivers, with both axes running from 1 to 1,000,000]


Multicast Monitoring Tools [SA01]

Management, Debugging and Modeling via Active / Passive Monitoring

[Figure: timeline (~1992, ~1997, ~2000, recent research work) of multicast monitoring tools grouped under management, debugging, and modeling. Tools shown: mrmap, mrinfo, mrdebug, rtpmon, mtrace, mwatch, mlisten, Dr. Watson, MultiMon, mhealth, RouteMonitor, MantaRay, NIMI *, mantra, sdr-mon, Otter, MRM *, mwalk, HPMM, mstat, mview, mrtree, GDT, NetIQ's Chariot *, mmon, SNMP_NG. Studies shown: Mah's study, Yajnik's study, Handley's study, MINC *.]

* : can be used for active monitoring


Needs for Multicast Fault Isolation

1. Monitoring of multicast networks has become crucial for maintaining multicast operations
  – since the delivery service in multicast is more complex than in traditional unicast networks
  – Supervising multicast traffic is a more difficult problem, as each multicast tree involves multiple hosts with correlated, simultaneous faults.

2. Multicast faults have various causes.
  – session announcement problems, reception problems, multicast router problems, congestion and rate-limiting problems, multicast routing problems, etc. [TA00]

3. Fault isolation is not easy even in single-source multicast, to say nothing of multi-source multicast.


Analysis on Single-Source Approach

Only for fault detection

1. MRM (Multicast Reachability Monitor) [SA01]
  • active probing from a test sender (TS) to a test receiver (TR), coordinated by an MRM manager
2. SMRM (SNMP-Based MRM) [AT02]
  • SNMP-based approach that defines several MIBs for multicast monitoring

Both detection and isolation

3. HPMM (Hierarchical Passive Multicast Monitor) [WL00]
  • passive monitoring scheme in which agents are organized in a hierarchy and communicate with each other using unicast
4. MTR (Fault Isolation in Multicast Tree) [RGE00]
  • receiver-driven method using IGMP multicast traceroute

Most approaches to date have focused on single-source multicast.


MRM (Multicast Reachability Monitor) [SA01] - Description

Step 1: Manager configures TS(s) and TR(s)

Step 2: TS transmits

Step 3: TR(s) monitor group transmission

Step 4: Manager collects and displays TR reports

[Figure: an MRM manager, one test sender (TS), and three test receivers (TR1, TR2, TR3) attached to routers R1 to R6; legend: router, end-host, manager-agent communication]

TS: test sender, TR: test receiver
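The four MRM steps can be sketched as a toy simulation. This is only an illustration under stated assumptions, not the MRM protocol itself: the function name `run_mrm_session`, the per-receiver loss probabilities, and the 10% reporting threshold are all hypothetical. Note that the result only says which receivers see loss (detection); it says nothing about where in the tree the fault lies.

```python
import random

def run_mrm_session(receivers, packets=100, loss_rates=None, threshold=0.1, seed=0):
    """Simulate one MRM-style test session (hypothetical model, not the real protocol).

    receivers  -- names of the test receivers (TRs)
    loss_rates -- assumed per-TR loss probability; missing TRs lose nothing
    Returns (loss report per TR, sorted list of TRs above the threshold).
    """
    rng = random.Random(seed)
    loss_rates = loss_rates or {}
    # Step 1: the manager configures the TS and TRs (here: the arguments above).
    # Step 2: the TS transmits `packets` numbered test packets to the group.
    reports = {}
    for tr in receivers:
        # Step 3: each TR monitors the group and counts the packets that arrive.
        p = loss_rates.get(tr, 0.0)
        received = sum(1 for _ in range(packets) if rng.random() >= p)
        reports[tr] = 1 - received / packets
    # Step 4: the manager collects the reports and flags lossy receivers.
    faulty = sorted(tr for tr, loss in reports.items() if loss > threshold)
    return reports, faulty

reports, faulty = run_mrm_session(["TR1", "TR2", "TR3"], loss_rates={"TR2": 1.0})
print(faulty)
```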


SMRM (SNMP-Based MRM) [AT02] - Description

[Figure: smrmMIB group in the extended MIB II]


HPMM (Hierarchical Passive Multicast Monitor) [WL00] - Description

[Figure: agents A, B, C, D, and E in the local domain; source 1 in foreign domain 1 and source 2 in foreign domain 2; edges are labeled with the multicast group (1 or 2) they carry]

• Each node knows exactly which upstream agent to notify when a fault occurs.
• Node D has only one parent, node B, for both multicast groups 1 and 2.
• Node E has a parent agent in B for group 1 and a parent agent in C for group 2.
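The per-group parent relationship can be pictured as a lookup table keyed by (agent, group). The sketch below is an assumption-laden illustration: the entries for D and E come from the slide, while the entries that make A the parent of B and C are guesses based on the figure layout, and `notify_chain` is a hypothetical helper, not part of HPMM.

```python
# (agent, group) -> parent agent to notify for that group
PARENT = {
    ("D", 1): "B", ("D", 2): "B",   # from the slide: D has parent B for both groups
    ("E", 1): "B", ("E", 2): "C",   # from the slide: E has parent B for group 1, C for group 2
    ("B", 1): "A", ("B", 2): "A",   # assumption: A sits above B in the figure
    ("C", 2): "A",                  # assumption: A sits above C in the figure
}

def notify_chain(node, group):
    """Return the upstream agents that a fault report for `group` passes through."""
    chain = []
    while (node, group) in PARENT:
        node = PARENT[(node, group)]
        chain.append(node)
    return chain

print(notify_chain("E", 1))
print(notify_chain("E", 2))
```

Because the table is keyed by group, the same agent can report along different branches for different groups, which is what the E example on the slide shows.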


MTR (Fault Isolation in Multicast Tree) [RGE00] - Description

[Figure: multicast tree with a source and receivers Ra, Rb, and Rc, shown before and after isolation; the "after" panel marks the fault, the isolated fault region, and the common ancestor router of Ra & Rc]
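The idea of narrowing a fault to a region of the tree can be illustrated with a generic path-intersection heuristic. This is not the algorithm of [RGE00]; it is a minimal sketch that assumes each receiver already knows its router path from the source (for example from a multicast traceroute) and that we know which receivers report loss. The router names R1 and R2 and the function `suspect_links` are hypothetical.

```python
def links(path):
    """Turn a router path [a, b, c] into its links [(a, b), (b, c)]."""
    return list(zip(path, path[1:]))

def suspect_links(paths, lossy):
    """Links shared by every lossy receiver's path and used by no healthy receiver.

    paths -- receiver -> list of nodes from the source down to that receiver
    lossy -- set of receivers that report loss
    """
    if not lossy:
        return []
    lossy = sorted(lossy)
    shared = set(links(paths[lossy[0]]))
    for r in lossy[1:]:
        shared &= set(links(paths[r]))
    healthy = set()
    for r, p in paths.items():
        if r not in lossy:
            healthy.update(links(p))
    # Keep the source-to-receiver order of the first lossy path.
    return [l for l in links(paths[lossy[0]]) if l in shared and l not in healthy]

paths = {
    "Ra": ["S", "R1", "R2", "Ra"],
    "Rb": ["S", "R1", "Rb"],
    "Rc": ["S", "R1", "R2", "Rc"],
}
print(suspect_links(paths, {"Ra", "Rc"}))
```

In this example R2 is the common ancestor router of Ra and Rc, and Rb still receives traffic through R1, so the only remaining suspect is the link from R1 to R2.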


Comparison on Related Works

Approach | Active / Passive | Single-source (Detect, Isolate), Multi-source (Detect, Isolate) | Remarks
MRM | Active | ○ △ | test session
SMRM | Passive | ○ △ | SNMP-based
HPMM | Passive | ○ ○ | child-parent relationship
MTR | Active | ○ ○ | IGMP mtrace

※ None of the existing approaches is sufficient for fault isolation in multi-source multicast networks.


Message Complexity of Current Approaches

1. Network overload increases exponentially as the number of members grows
  • As the member count grows, the number of mtrace request and mtrace reply packets becomes excessive.

2. Simulation results from ns-2
  • tree topology: 100 nodes, out-degree 3
  • number of members: 5 to 60 (in steps of 5)
  • each value is the average of 5 runs

3. Thus, a different strategy is needed to handle multi-source multicast fault detection and isolation.

[Figure: overload (x 1,000, axis from 0 to 300) versus number of members (5 to 60)]


Issues

1. Application Characteristics
2. Message Complexity
3. Fault Isolation Error
4. Scalability
5. Deployment

1. Application Characteristics

Comparison of two applications [CRSZ01]

Characteristic | Conferencing Application | Broadcasting Application
Performance requirements | require low latency and high bandwidth | interested in bandwidth; latency is not a concern
Loss | tolerates loss | requires reliable data delivery
Session length | long-lived, over 10 min | short-lived
Group characteristics | dynamic and small groups | relatively static
Source transmission patterns | multiple sources | a single static source


Issues for Multi-Source Multicast Fault Isolation

1. Message Complexity
  – Message complexity will be the main concern.
  – It should grow logarithmically, not linearly, with the number of members
    • not O(N), but O(log N) or O(1)

2. Fault Isolation Error
  – Should be the same as or lower than in previous approaches.
  – No sudden computation overload to isolate faults
  – Near-real-time fault detection and isolation

3. Scalability
  – not affected by the number of members
  – handles dynamic member actions such as join / leave

4. Deployment
  – should be easily deployable, without depending on particular protocols and techniques.

[Figure: message complexity versus size of members, with one curve labeled "not good" and another labeled "acceptable"]


Candidate Model

• Goal: isolate a fault promptly and accurately when it occurs in the multicast network, using an efficient and scalable approach.

• Basic idea: member grouping
  1. do not let every member send probes
  2. the paths from a local member to other members share segments
  3. make maximum use of this shared information
  4. only group leaders send fault-isolation probes, and only to other group leaders

• Benefits
  1. reduced message complexity
  2. scalable, since it does not depend on the number of members


Draft Model

1. Each group selects a group leader.

2. The group leader manages its members and sends probes for fault isolation.

3. A leader does not probe all other group leaders; it probes only the common ancestor router it shares with other group leaders.

[Figure: four groups of members: Group A (A1, A2, A3), Group B (B1, B2), Group C (C1), Group D (D1, D2)]
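A rough way to see why grouping helps is to count probes under a toy model. The sketch below is a back-of-the-envelope illustration only: it assumes one probe per (prober, source) pair plus one leader announcement per non-leader member, which is a simplification and not the model used in the ns-2 simulation. Both function names are hypothetical.

```python
def all_member_probes(members, sources):
    """Every member probes every source (assumed cost: one probe per pair)."""
    return members * sources

def group_leader_probes(members, sources, groups):
    """Only one leader per group probes; other members just hear their leader.

    Assumed cost: one probe per (leader, source) pair plus one
    leader announcement reaching each non-leader member.
    """
    leaders = groups
    return leaders * sources + (members - leaders)

# 60 members in 6 groups, 10 sources
print(all_member_probes(60, 10))
print(group_leader_probes(60, 10, 6))
```

Under these assumptions the probe count scales with the number of groups rather than the number of members, which is the property the draft model is after.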


Member Grouping

1. How members are grouped
  – simply, by the boundary of a border router
  – a way to form larger groups is needed, since the number of groups can still be large

2. How members in a group learn their group leader
  – the group leader periodically sends an "i-am-leader" packet to the group members

3. How a member learns whether a group leader exists
  – a newly joined member sends an "i-am-leader" packet within the group using multicast scoping
  – if there is no response, it becomes the leader.
  – if another member sends an "i-am-leader" packet, it assumes a leader already exists.

[Figure: one group behind a border router, with its group leader]


Group Leader Action Lists

1. Managing members in the group
  – uses "i-am-leader" packets to control group members
  – sends a "you-are-leader" packet when it leaves

2. Fault isolation
  – the primary function of a group leader
  – information is exchanged among group leaders

3. Group leader announcement
  – announcing and discovering group leaders is not easy
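The join rule from the Member Grouping slide and the "you-are-leader" handoff can be sketched as a small state machine. This is a simplified illustration, not a protocol specification: the `Scope` class stands in for one scoped multicast region, timeouts are reduced to an immediate "response or silence", and handing leadership to the first remaining member is an assumption (the slides do not say who receives "you-are-leader").

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Scope:
    """Hypothetical stand-in for one scoped region behind a border router."""
    leader: Optional[str] = None
    members: List[str] = field(default_factory=list)

    def claim(self, sender: str) -> Optional[str]:
        """Deliver an "i-am-leader" claim; return the responding leader, or None on silence."""
        if self.leader is not None and self.leader != sender:
            return self.leader
        return None

def join(scope: Scope, name: str) -> Optional[str]:
    """A new member sends "i-am-leader"; with no response it becomes the leader."""
    scope.members.append(name)
    if scope.claim(name) is None:
        scope.leader = name
    return scope.leader

def leave(scope: Scope, name: str) -> Optional[str]:
    """A departing leader sends "you-are-leader" to a remaining member (assumed: the first)."""
    scope.members.remove(name)
    if scope.leader == name:
        scope.leader = scope.members[0] if scope.members else None
    return scope.leader

s = Scope()
print(join(s, "A1"), join(s, "A2"), leave(s, "A1"))
```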


Simulation Results

• Overview
  – Simulated a simplified protocol using the ns-2 simulator
  – Random graph generated by GT-ITM
  – Values averaged over five simulation runs
  – Compared with the best approach among the related works

• Results
  – All-member-based (best-performance)
  – Group-leader-based
  – message complexity reduced by 68%

[Figure: message complexity (0 to 18,000) versus number of sources (10 to 100) for the all-member-based approach and the group-leader approach; annotations on the plot read y = 176x and y = 58x]


Conclusion

1. It is important to locate faults in a network.

2. Little work has been done on fault isolation, or even on fault detection, in multi-source multicast.

3. In multi-source multicast fault isolation, message complexity is the main concern.

4. One candidate approach is a group-based architecture for locating faults in a multi-source multicast session.

5. Simulation results show that the group-based approach reduces message complexity by 68% compared with the best-performing approach among the others.

6. However, the group-based approach is still not sufficient, for reasons such as scalability.


Future Works

• A more efficient approach to message complexity is needed.

• One possible model is a suppressed one-way probing mechanism.
  – The source sends a special packet to the multicast group.
  – Every internal router records its routing information in the special packet.
  – Without packet suppression, an implosion problem will occur.
  – Each receiver compares the recorded information to check whether the routing path has changed.
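The receiver side of one-way probing can be sketched as follows. This is a minimal illustration under assumptions: the probe is modeled as the list of router names recorded along the path, and "suppression" is reduced to staying silent when the path is unchanged, which is only one possible reading of the slide. The `Receiver` class and its report format are hypothetical.

```python
class Receiver:
    """Remembers the last recorded path per source and reports only changes."""

    def __init__(self):
        self.last_path = {}  # source -> router path recorded by the previous probe

    def on_probe(self, source, recorded_path):
        """Return a change report, or None when there is nothing to report."""
        previous = self.last_path.get(source)
        self.last_path[source] = list(recorded_path)
        if previous is None or previous == list(recorded_path):
            return None  # first probe or unchanged path: stay silent
        # Find the first hop at which the old and new paths differ.
        i = 0
        while (i < min(len(previous), len(recorded_path))
               and previous[i] == recorded_path[i]):
            i += 1
        return {
            "source": source,
            "diverges_after": recorded_path[i - 1] if i else None,
            "old": previous,
            "new": list(recorded_path),
        }

r = Receiver()
r.on_probe("S", ["R1", "R2", "R4"])          # first probe: baseline only
report = r.on_probe("S", ["R1", "R3", "R4"])  # path now goes through R3
print(report["diverges_after"])
```

Since the source sends one probe and receivers answer only on a change, feedback traffic stays low while paths are stable.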

[Figure: message complexity (r = 10), axis from 0 to 12 x 10^4, versus number of sources (0 to 1,200), for no suppression, max suppression, and min suppression]

• Simulation results show that suppressed one-way probing is well suited to multi-source multicast networks.

• Several things remain to be elaborated…

• Any comments will be appreciated.


References

[AT02] E. Al-Shaer and Y. Tang, “SMRM: SNMP-based multicast reachability monitoring,” in IEEE/IFIP Network Operations and Management Symposium (NOMS) 2002, Florence, Italy, April 2002.

[CRSZ01] Yang-hua Chu, Sanjay G. Rao, Srinivasan Seshan and Hui Zhang, “Enabling conferencing applications on the Internet using an overlay multicast architecture,” in ACM SIGCOMM 01, San Diego, California, August 2001.

[LN01] J. Liebeherr and M. Nahas, “Application-layer multicast with Delaunay Triangulations,” Global Internet Symposium, IEEE GlobeCom 2001, San Antonio, Texas, November 2001.

[RGE00] A. Reddy, R. Govindan and D. Estrin, “Fault isolation in multicast trees,” in Proceedings of ACM SIGCOMM 2000, Stockholm, Sweden, August 2000.

[Rqm] C. Perkins, “RTP Quality Matrix,” [online], http://www-mice.cs.ucl.ac.uk/multimedia/software/rqm/ (Accessed: 7 March 2003).

[SA01] K. Sarac and K. C. Almeroth, “Supporting multicast deployment efforts: A survey of tools of multicast monitoring,” Journal of High Speed Networking--Special Issue on Management of Multimedia Networking, vol. 9, num. 3/4, pp. 191-211, March 2001.

[TA00] D. Thaler, B. Aboba, “Multicast Debugging Handbook,” Internet draft, draft-ietf-mboned-mdh-*.txt, Internet Engineering Task Force (IETF), November 2000.

[WL00] J. Walz and B. N. Levine, “A hierarchical multicast monitoring scheme,” In 2nd International Workshop on Networked Group Communication, Nov. 2000.


Thank you.

Questions?

Comments?