Study group 2012.04.09 Junction SHERLOCK IS AROUND: DETECTING NETWORK FAILURES WITH LOCAL EVIDENCE FUSION Qiang Ma 1 , Kebin Liu 2 , Xin Miao 1 , Yunhao Liu 1,2 1 Department of Computer Science and Engineering, Hong Kong University of Science and Technology 2 MOE Key Lab for Information System Security, School of Software, Tsinghua National Lab for Information Science and Technology, Tsinghua University 2012/04/09 1


Page 1: Sherlock is around:  Detecting Network failures with local evidence fusion

Study group, 2012.04.09, Junction

SHERLOCK IS AROUND: DETECTING NETWORK FAILURES WITH LOCAL EVIDENCE FUSION

Qiang Ma 1, Kebin Liu 2, Xin Miao 1, Yunhao Liu 1,2

1 Department of Computer Science and Engineering, Hong Kong University of Science and Technology
2 MOE Key Lab for Information System Security, School of Software, Tsinghua National Lab for Information Science and Technology, Tsinghua University

2012/04/09

Page 2: Sherlock is around:  Detecting Network failures with local evidence fusion

INTRODUCTION

Motivations:
  • WSNs are widely deployed for numerous applications
      • They need to run for years and operate reliably
  • They are error-prone and subject to component faults and performance degradations
  • Exploring the root causes is more challenging for WSNs
      • Ad-hoc nature of WSNs: large scale, dynamic changes of topology
      • Limited resources of sensor nodes: power, computation capability
      • A large variety of WSN-specific protocols exists

Page 3: Sherlock is around:  Detecting Network failures with local evidence fusion

RELATED WORKS

Traditional/popular ways of running the diagnosis process
  • Sink-based
      • Actively collect global evidence from sensor nodes to the sink: remaining energy, MAC layer backoff, neighbor table, routing table …
      • Conduct centralized analysis at the powerful back-end
      • Disadvantages
          • Communication overhead
  • Self-diagnosis
      • Avoids large overhead in the evidence collection process
      • Injects a fault inference model into sensor nodes
      • Nodes make local decisions
      • Disadvantages
          • Results come from single nodes: inaccurate due to the narrow scope
          • Inconsistent results from different inference processes

Page 4: Sherlock is around:  Detecting Network failures with local evidence fusion

LOCAL DIAGNOSIS (LD2)

Main Design
  • Diagnosis efficiency
      • Local diagnosis process instead of the back-end
      • Reduce communication overhead
  • Diagnosis accuracy
      • Take judgments from all nodes within the local area into consideration

Page 5: Sherlock is around:  Detecting Network failures with local evidence fusion

SYSTEM ARCHITECTURE

Working like this:
  • Nodes run NBC: state attributes = evidences
      • NBC: a Naïve Bayesian Classifier that encodes the probability correlation between a set of state attributes and root causes
      • Posterior probability distribution: P(root causes | evidences)
  • Once a node detects anomalies
      • If a neighbor node has been removed from its neighbor list, the process is triggered.
      • Construct a fusion tree and do evidence fusion
      • Fusion is based on Dempster-Shafer Theory (DST), the theory of evidence
Advantages:
  • Balance the workload
  • Ensure a local consensus on the final diagnosis result

Page 6: Sherlock is around:  Detecting Network failures with local evidence fusion

NAÏVE BAYESIAN CLASSIFIER (NBC)

  • Parameters learned from historical data
      • R: root cause; F_i, where i = 1, …, n: evidences; stored as discrete values
      • Calculate the posterior probability
  • The posterior probabilities of different root causes
      • Each node, based on the F_i observed, calculates the posterior probabilities
      • With a certain mapping (normalization), they are used later as the basic probability assignments in DST

Callouts on the posterior formula:
  • Pre-learned
  • Scale factor: constant for different R
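
The slide's formula did not survive in this transcript. The two callouts are consistent with the standard naive Bayes form, P(R | F_1, …, F_n) = P(R) · Π_i P(F_i | R) / P(F_1, …, F_n), where the priors and likelihoods are pre-learned and the denominator is a scale factor that is the same for every R. The sketch below works under that assumption. The root causes, evidence names, and every probability value are hypothetical and chosen only for illustration.

```python
from math import prod

# Hypothetical pre-learned parameters (illustrative values, not from the slides).
PRIOR = {"no_problem": 0.90, "node_crash": 0.04, "contention": 0.04, "route_loop": 0.02}

# LIKELIHOOD[evidence][root_cause][observed_value] = P(F_i = value | R)
LIKELIHOOD = {
    "neighbor_lost": {
        "no_problem": {True: 0.02, False: 0.98},
        "node_crash": {True: 0.90, False: 0.10},
        "contention": {True: 0.30, False: 0.70},
        "route_loop": {True: 0.10, False: 0.90},
    },
    "retx_rate": {
        "no_problem": {"low": 0.90, "high": 0.10},
        "node_crash": {"low": 0.40, "high": 0.60},
        "contention": {"low": 0.10, "high": 0.90},
        "route_loop": {"low": 0.30, "high": 0.70},
    },
}


def posterior(observed):
    """Return P(R | observed evidences) for every root cause R."""
    score = {
        r: PRIOR[r] * prod(LIKELIHOOD[f][r][v] for f, v in observed.items())
        for r in PRIOR
    }
    scale = sum(score.values())  # P(F_1, ..., F_n): identical for every R
    return {r: s / scale for r, s in score.items()}


post = posterior({"neighbor_lost": True, "retx_rate": "high"})
best = max(post, key=post.get)  # most probable root cause for this observation
```

Because the scale factor is shared by all root causes, a node that only needs the ranking can skip the division. The normalized values are what would be handed on as basic probability assignments.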

Page 7: Sherlock is around:  Detecting Network failures with local evidence fusion

DEMPSTER-SHAFER THEORY (DST)

Fundamentals
  • Allows us to combine evidence from different sources and arrive at a degree of belief in all possible states/hypotheses (R, root causes) that takes into account all the available evidence (F, metrics).
Terms:
  • Hypotheses: Ω
  • The frame of discernment: 2^Ω
  • Basic probability/belief assignment: m (subjective or objective)
      • In this work: posterior probability (objective)
  • A: focal element
  • Constraint:
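
The symbols for these terms were lost in the transcript. For reference, the textbook Dempster-Shafer definitions are given below, using the deck's convention that the frame of discernment is written 2^Ω. These are the standard forms, not necessarily the exact notation used on the slide.

```latex
% Basic probability assignment (mass function) over subsets of \Omega
m : 2^{\Omega} \to [0, 1], \qquad
m(\emptyset) = 0, \qquad
\sum_{A \subseteq \Omega} m(A) = 1

% A is a focal element of m when it carries non-zero mass
A \subseteq \Omega \ \text{is a focal element} \iff m(A) > 0
```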

Page 8: Sherlock is around:  Detecting Network failures with local evidence fusion

DEMPSTER-SHAFER THEORY (DST)

Different from the concept of probability
  • Belief:
  • Plausibility: Pl(s) = 1 - Bel(~s)
  • Belief <= plausibility
In this study
  • The frame of discernment, R_i: root causes
      • R_0: no problem
  • Only generates
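
The belief formula is missing from the transcript. The standard definitions, which agree with the plausibility relation quoted on the slide, are sketched here for reference.

```latex
% Belief: total mass committed to subsets of A
\mathrm{Bel}(A) = \sum_{B \subseteq A} m(B)

% Plausibility: total mass not contradicting A
\mathrm{Pl}(A) = \sum_{B \cap A \neq \emptyset} m(B) = 1 - \mathrm{Bel}(\bar{A})

% Hence belief never exceeds plausibility
\mathrm{Bel}(A) \le \mathrm{Pl}(A)
```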

Page 9: Sherlock is around:  Detecting Network failures with local evidence fusion

DEMPSTER’S RULE OF COMBINATION

  • Combine the beliefs from different observers (sensor nodes)
      • Used to do evidence fusion
      • Terms in the rule: conflict factor, joint mass
  • Problem: the combination result goes against practical sense!!
      • This happens with a low or a high conflict factor
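
The rule itself is not preserved in the transcript. The classical statement of Dempster's rule for two mass functions m_1 and m_2, with conflict factor K and the joint mass written m_{1,2}, is:

```latex
% Conflict factor: mass of all pairs of focal elements that do not intersect
K = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C)

% Joint mass: agreeing products, renormalised by the non-conflicting mass
m_{1,2}(\emptyset) = 0, \qquad
m_{1,2}(A) = \frac{1}{1 - K} \sum_{B \cap C = A} m_1(B)\, m_2(C)
\quad (A \neq \emptyset,\ K < 1)
```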

Page 10: Sherlock is around:  Detecting Network failures with local evidence fusion

LOW/HIGH CONFLICT FACTOR

Example:
  • Hypotheses Ω = {T, M, C}
      • T: brain tumor
      • M: meningitis
      • C: concussion
  • The frame of discernment = 2^Ω

Doctor A        2^Ω   Doctor B
m(A1) = 0.99    T     m(B1) = 0.99
m(A2) = 0.01    M     m(B2) = 0
m(A3) = 0       C     m(B3) = 0.01

Doctor A        2^Ω   Doctor B
m(A1) = 0.99    T     m(B1) = 0
m(A2) = 0.01    M     m(B2) = 0.01
m(A3) = 0       C     m(B3) = 0.99

∩     A1   A2   A3
B1    Ø    Ø    Ø
B2    Ø    M    Ø
B3    Ø    Ø    Ø

m(T)=1!!
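
The two doctor tables can be checked numerically. The sketch below applies the classical Dempster rule, assumed here because the slide's own formula is not in the transcript, to both tables. The names `combine`, `low_b`, and `high_b` are illustrative.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule for two mass functions keyed by frozenset focal elements.

    Returns (joint mass, conflict factor K).
    """
    joint, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            joint[inter] = joint.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # product mass that would fall on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: the rule is undefined")
    return {s: w / (1.0 - conflict) for s, w in joint.items()}, conflict


T, M, C = frozenset("T"), frozenset("M"), frozenset("C")
doctor_a = {T: 0.99, M: 0.01, C: 0.0}
low_b = {T: 0.99, M: 0.0, C: 0.01}   # first table: Doctor B mostly agrees with A
high_b = {T: 0.0, M: 0.01, C: 0.99}  # second table: Doctor B mostly disagrees

low, k_low = combine(doctor_a, low_b)
high, k_high = combine(doctor_a, high_b)
```

With the first table the conflict factor is about 0.0199 and all joint mass lands on T, so the small doubts both doctors expressed vanish. With the second table the conflict factor is about 0.9999 and all joint mass lands on M, the diagnosis each doctor rated at only 0.01. Both outcomes illustrate why the unmodified rule is called counter-intuitive.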

Page 11: Sherlock is around:  Detecting Network failures with local evidence fusion

MODIFIED COMBINATION RULE

  • Trust the results on which the nodes reach a high consensus
  • Definition 1: the distance between m1 and m2 is
      • Where
      • And
  • Proof:

Page 12: Sherlock is around:  Detecting Network failures with local evidence fusion

12

Definition 2: The similar degree of m1 and m2 is

If we have one node i whose M i is similar to all the others, than we believe that this node’s M i is important.

Definition 3: The basic confidence of evidence i (i = 1,2,..,N)

Normalization: Modified = Basic probability assignment x basic confidence

Reduce the impact of those evidences with less importance

2012/04/09

MODIFIED COMBINATION RULE
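
The formulas for Definitions 1 to 3 are not preserved in the transcript, so the sketch below only illustrates the idea and is not the authors' exact rule. Everything specific in it is an assumption: the distance is a half-squared-Euclidean distance over single hypotheses, similarity is taken as 1 minus distance, and confidence is each node's total similarity to the others divided by the largest such total.

```python
from math import sqrt

HYPOTHESES = ("T", "M", "C")  # hypothetical singleton hypotheses


def distance(m1, m2):
    # ASSUMPTION: stand-in for Definition 1; lies in [0, 1] for singleton masses.
    return sqrt(0.5 * sum((m1.get(h, 0.0) - m2.get(h, 0.0)) ** 2 for h in HYPOTHESES))


def basic_confidence(masses):
    """One confidence value per evidence source; the best-supported source gets 1.0."""
    n = len(masses)
    # ASSUMPTION: similarity = 1 - distance; support = sum of similarities to the others.
    support = [
        sum(1.0 - distance(masses[i], masses[j]) for j in range(n) if j != i)
        for i in range(n)
    ]
    top = max(support)
    if top == 0.0:  # every pair is maximally distant: no source is favoured
        return [1.0] * n
    return [s / top for s in support]


def discount(mass, confidence):
    """Modified assignment = basic probability assignment x basic confidence."""
    return {h: w * confidence for h, w in mass.items()}


node_masses = [
    {"T": 0.9, "M": 0.1},  # two nodes that roughly agree ...
    {"T": 0.8, "M": 0.2},
    {"C": 0.9, "M": 0.1},  # ... and one outlier
]
conf = basic_confidence(node_masses)
modified = [discount(m, c) for m, c in zip(node_masses, conf)]
```

In this toy input the outlier receives the lowest confidence, so its assignment is scaled down the most before fusion. How the mass removed by the scaling is redistributed is not stated on the slide and is left out here.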

Page 13: Sherlock is around:  Detecting Network failures with local evidence fusion

EVIDENCE FUSION

  • Criterion: the fusion result stays the same even if we change the fusion order
  • Theorem 1:
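
The statement of Theorem 1 is missing from the transcript. The order-independence criterion can still be illustrated with the classical Dempster rule, which is commutative and associative. The sketch below assumes all mass sits on single root causes, so the joint mass reduces to a normalized element-wise product. The tree shape, node names, and mass values are hypothetical.

```python
from functools import reduce


def combine(m1, m2):
    """Classical Dempster rule when every focal element is a single root cause."""
    raw = {h: m1[h] * m2[h] for h in m1}
    agree = sum(raw.values())  # equals 1 - K, the non-conflicting mass
    if agree == 0.0:
        raise ValueError("total conflict: the rule is undefined")
    return {h: w / agree for h, w in raw.items()}


def fuse_tree(node, children, local):
    """Fuse a node's own evidence with the fused evidence of each of its subtrees."""
    result = local[node]
    for child in children.get(node, ()):
        result = combine(result, fuse_tree(child, children, local))
    return result


# Hypothetical local evidence at four nodes.
local = {
    "root": {"crash": 0.6, "contention": 0.3, "loop": 0.1},
    "a": {"crash": 0.5, "contention": 0.4, "loop": 0.1},
    "b": {"crash": 0.7, "contention": 0.2, "loop": 0.1},
    "c": {"crash": 0.2, "contention": 0.7, "loop": 0.1},
}
children = {"root": ["a", "b"], "a": ["c"]}  # hypothetical fusion tree

tree_result = fuse_tree("root", children, local)
flat_result = reduce(combine, [local[n] for n in ("c", "b", "root", "a")])
```

Fusing up the tree and fusing the same evidence as a flat list in a different order give the same distribution up to floating-point rounding. This is the property that lets intermediate tree nodes share the combination work.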

Page 14: Sherlock is around:  Detecting Network failures with local evidence fusion

FUSION ALGORITHM

  • Trigger node
      • Detects abnormal symptoms: node crash, traffic contention, route loop
      • Determines the diagnosis area ???
  • Standard set
      • Reduces computation overhead
      • Root node and its one-hop neighbors
  • Establish the fusion tree
      • DREQ contains the details of the diagnosis task
      • Standard set => basic confidence

Page 15: Sherlock is around:  Detecting Network failures with local evidence fusion

EVIDENCE FUSION ALGORITHM

In case of the loss of DREQ

Page 16: Sherlock is around:  Detecting Network failures with local evidence fusion

EVALUATION

  • CitySee project: urban carbon dioxide sensing
      • 494 sensor nodes
  • Testbed using the CTP protocol
      • 50 TelosB motes
  • Comparison: LD2 and TinyD2
  • Manually inject evidences
      • Node crash, traffic contention, route loop
  • Metrics
      • False negative rate vs. false positive rate
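
As a reminder of how the two metrics are computed, here is a minimal sketch using the usual definitions. The function name and the sample labels are made up for illustration and are not results from the evaluation.

```python
def error_rates(truth, reported):
    """Return (false negative rate, false positive rate).

    truth[i] is True when a fault was really injected in trial i;
    reported[i] is True when the diagnosis reported a fault in trial i.
    """
    pairs = list(zip(truth, reported))
    tp = sum(1 for t, r in pairs if t and r)
    fn = sum(1 for t, r in pairs if t and not r)
    fp = sum(1 for t, r in pairs if not t and r)
    tn = sum(1 for t, r in pairs if not t and not r)
    fnr = fn / (fn + tp) if (fn + tp) else 0.0  # missed faults / all real faults
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # false alarms / all fault-free trials
    return fnr, fpr


# Made-up trial outcomes: 4 trials with a fault, 6 without.
truth = [True, True, True, True, False, False, False, False, False, False]
reported = [True, True, True, False, True, False, False, False, False, False]
fnr, fpr = error_rates(truth, reported)
```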

Page 17: Sherlock is around:  Detecting Network failures with local evidence fusion

TINYD2 [1]

  • Fault detector (self-diagnosis)
      • Finite state machine (FSM) model
      • Fault detector M = (E, S, S0, f, F)
          • E: the set of input evidences
          • S: the set of states
          • S0: start state
          • f: state transition function
          • F: all accept states (accept states: final diagnosis decision)
  • E.g. high retransmission rate between A and B (A->B)
      • A finds the rate increasing
      • A broadcasts the current state together with the fault detector
      • If B receives it, B checks ACK or DATA
      • B -> S2 and broadcasts -> Ci
      • NUM: threshold
      • Bc: severe contention at B

[1] Kebin Liu, Qiang Ma, Xibin Zhao, Yunhao Liu, "Self-diagnosis for large scale wireless sensor networks," INFOCOM, 2011
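
The five-tuple M = (E, S, S0, f, F) maps directly onto a small data structure. The sketch below is a generic FSM, not TinyD2's actual detector. The evidence names and the transition table are hypothetical and only loosely follow the retransmission example.

```python
class FaultDetector:
    """Finite state machine M = (E, S, S0, f, F)."""

    def __init__(self, evidences, states, start, transitions, accept):
        self.evidences = evidences      # E: set of input evidences
        self.states = states            # S: set of states
        self.state = start              # S0: start state (also the current state)
        self.transitions = transitions  # f: (state, evidence) -> next state
        self.accept = accept            # F: accept states, i.e. final diagnosis decisions

    def feed(self, evidence):
        """Consume one evidence; return True once an accept state is reached."""
        # Evidence with no matching transition leaves the state unchanged.
        self.state = self.transitions.get((self.state, evidence), self.state)
        return self.state in self.accept


# Hypothetical detector for "severe contention at B" (state Bc).
detector = FaultDetector(
    evidences={"retx_rate_rising", "ack_or_data_seen", "count_over_NUM"},
    states={"S0", "S1", "S2", "Bc"},
    start="S0",
    transitions={
        ("S0", "retx_rate_rising"): "S1",
        ("S1", "ack_or_data_seen"): "S2",
        ("S2", "count_over_NUM"): "Bc",
    },
    accept={"Bc"},
)

outcomes = [
    detector.feed(e)
    for e in ("retx_rate_rising", "ack_or_data_seen", "count_over_NUM")
]
```

In TinyD2 the current state travels between nodes together with the detector. The single object here keeps everything in one place for brevity.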

Page 18: Sherlock is around:  Detecting Network failures with local evidence fusion

TIME COST

  • Problem node: 25
      • With 16 neighbors
  • Root node of the fusion tree: 13
  • Time cost
      • Sampling evidences
          • Assign local basic confidence
      • Establishing the fusion tree
          • Receive & broadcast beacons

Observations:
  • Time cost is stable across all the tree structures
  • Traffic contention takes longer: the DEVI packet contains 3 possible root causes (1. ingress overflow, 2. egress overflow, 3. bad link) => more combination work is needed

Page 19: Sherlock is around:  Detecting Network failures with local evidence fusion

DIAGNOSIS ACCURACY

  • Decrease as neighbors increase: more determinate diagnosis
  • TinyD2 performs unstably: worse when neighbors increase => fails to achieve a consensus
  • Several root causes make it difficult for TinyD2 to use the FSM to reach an accept state

Page 20: Sherlock is around:  Detecting Network failures with local evidence fusion

COUPLING EFFECT WITH APPLICATION

Application packet loss

Page 21: Sherlock is around:  Detecting Network failures with local evidence fusion

CONCLUSION

  • Conduct diagnosis in a local area
      • Reduce the communication overhead
  • Distribute the diagnosis workload to the sensor nodes within a diagnosis area
  • Use a fusion tree to do evidence fusion
      • A local consensus on the final diagnosis report is achieved
  • Need to predefine the failures!!