[IEEE 2008 11th IEEE International Conference on Communication Technology (ICCT 2008) - Hangzhou, China (2008.11.10-2008.11.12)] 2008 11th IEEE International Conference on Communication

Distributed Fault Diagnosis of Wireless Sensor Networks*

Xianghua Xu Wanyong Chen Jian Wan Ritai Yu Grid and Services Computing Lab

School of Computer Science, Hangzhou Dianzi University Hangzhou 310018, China

[email protected], [email protected], [email protected], [email protected]

Abstract—In wireless sensor networks (WSN), sensor node faults are common because of the sensor devices themselves and the harsh deployment environment. To avoid degradation of service due to faults, the WSN must be able to detect faults early. In this paper we propose a localized fault diagnosis algorithm that executes effectively in tree-like networks. It is based on local comparisons of sensed data and dissemination of the test results to the remaining sensors. Furthermore, time redundancy is used to diagnose intermittent faults in sensing and communication. Simulation results show that the proposed algorithm has high detection accuracy and a low false alarm rate.

Keywords-Wireless sensor network; fault diagnosis; Distributed algorithm

I. INTRODUCTION

Wireless sensor networks have recently emerged as computing platforms for various applications such as environmental monitoring, security surveillance and target tracking. The tiny sensor nodes can easily be deployed into a designated area to form a wireless network and perform specific functions [1][2][3]. Because the nodes are low-cost and are deployed in large numbers in uncontrolled, harsh or even hostile environments, it is not uncommon for them to become faulty and unreliable. Faulty data harms the network in three ways: (1) it decreases the judgment accuracy of the base station; (2) it increases the traffic in the network; (3) it wastes the limited energy of the nodes [4].

Because the sensor nodes typically route measurements to the base station in a tree-like fashion (with the base station at the root of the tree), a node can easily test its parent sensor. Our algorithm adopts the judgment principle proposed in [5] to identify one of the sensors in the sub-tree, and then uses its result to judge the remaining sensors’ status through the parent-child relationship. Unlike the algorithm in [5], we consider intermittent faults and the communication cost.

The rest of the paper is organized as follows. We first review the literature on fault diagnosis in Section II. Then, we define the network model and fault model in Section III. A fault diagnosis scheme is proposed in Section IV. After that, the simulation results are presented in Section V. Finally, the paper is concluded in Section VI.

II. RELATED WORK

In this section, we briefly review the related works in the area of fault diagnosis in wireless sensor networks.

In [5], a distributed fault detection algorithm was proposed to locate the faulty sensors in the WSN. It calculates the measurement difference between neighbor sensors at different times to find whether the current measurement of a sensor differs from its previous measurement. If the measurement changes significantly over time, the sensor is more likely to be faulty. However, the communication cost of the algorithm is high and it cannot diagnose intermittent faults.

Using a management architecture, a failure diagnosis scheme called MANNA [6] was proposed for WSN. The scheme creates a manager, which has a global view of the network, to perform complex tasks such as retrieving the node state and detecting node failures. However, the centralized management and the communication overhead may not be realistic for many applications.

In [7], a taxonomy for the classification of faults in sensor networks and the first on-line model-based testing technique were introduced. This approach can be applied to an arbitrary system of heterogeneous sensors with an arbitrary type of fault model. However, the technique is centralized: it is up to the base station to collect sensor node information and conduct the on-line fault diagnosis.

Ding et al. [8] proposed a faulty sensor identification algorithm that requires low computational overhead. Each sensor node compares its own sensed data with the median of its neighbors’ data to determine its own status. The performance of the localized diagnosis, however, is limited by the non-uniform node degrees in sensor networks with random deployment. The paper also mentions the need for the sensors’ physical locations, which requires expensive GPS or other techniques.

Luo et al. [9] discussed how to choose the neighbor size and how to address both noise-related measurement error and sensor faults simultaneously in fault-tolerant detection. However, they did not explicitly attempt to detect faulty sensors; instead, they proposed algorithms to improve the event detection accuracy in the presence of faulty sensors. Another shortcoming is that their proposed schemes apply only to the binary decision situation.

*This work is supported by Science and Technology Research and Development Program of Zhejiang Province, China. (Grant No. 2008C11100, 2007C11023, and 2007R40G2040097)

978-1-4244-2251-7/08/$25.00 ©2008 IEEE

III. MODEL

We assume sensors are randomly deployed in the area of interest and all sensors have a common transmission range. The area is assumed to be entirely covered by the sensors. All sensor nodes route measurements to the base station in a tree-like fashion (with the base station at the root of the tree), and they can communicate with their neighbors within the transmission range. In Fig. 1, the black circles represent faulty sensors and the white circles represent good sensors.

Figure 1 Sensor nodes randomly deployed and the fashion of routing measurements

In this paper, we assume that all system software and application software are already fault tolerant. We consider only sensor faults, which include three types: calibration systematic error, random noise error, and complete malfunctioning. Faulty nodes are still capable of receiving, sending, and processing. The goal of the algorithm is to identify them. Since we adopt the judgment principle proposed in [5] to identify faulty nodes in a particular region, we assume that each sensor has at least 3 neighboring nodes. Because a large number of sensors are deployed in the area of interest to form a wireless network, this condition is easily satisfied.

IV. LOCALIZED FAULT DIAGNOSIS

In this section, we first define the notation. Then, we introduce the localized fault diagnosis algorithm for tree-like networks.

A. Definitions

Table 1 summarizes the notations we will use in our discussion.

Table 1 Summary of notations

Symbol                      Definition
$p$                         Probability of failure of a sensor
$N_i$                       The $i$th sensor in the network
$S(N_i)$                    Set of the neighbors of $N_i$
$x_i$                       Measurement of sensor $N_i$
$n$                         Number of neighbor sensors
$q$                         Total times of measurement between $x_i$ and $x_j$
$d_{ij}^{t}$                Measurement difference between $N_i$ and $N_j$ at the $t$th time, $d_{ij}^{t} = x_i^{t} - x_j^{t}$, $t \le q$
$c_{ij}^{t}$                Test between $N_i$ and $N_j$ at the $t$th time, $c_{ij}^{t} \in \{0,1\}$, $c_{ij}^{t} = c_{ji}^{t}$, $t \le q$
$R_{ij}$                    Comparison result between $N_i$ and $N_j$, $R_{ij} \in \{0,1\}$, $R_{ij} = R_{ji}$
$\theta_1$ and $\theta_2$   Two predefined threshold values
$T_i$                       Tendency value of $N_i$, $T_i \in$ {LG (Likely Good), LF (Likely Fault), GD (Good), FT (Fault)}

Sensors are considered as neighboring sensors if they are within the transmission range of each other. Each node regularly sends its measured value to all its neighbors.

If more than half of a sensor’s neighbors report a significantly different value from it, we examine its history data. Sensing data are collected $q$ times ($q > 1$) to find whether the current measurement differs from previous measurements. If the measurements change significantly over time, the sensor is more likely to be faulty.

A comparison result $R_{ij}$ is generated by sensor $N_i$ based on its neighbor $N_j$’s measurements using two variables, $d_{ij}^{t}$ and $c_{ij}^{t}$, and two predefined threshold values $\theta_1$ and $\theta_2$. If $|d_{ij}^{t}| < \theta_1$, then $c_{ij}^{t} = 0$; conversely, $c_{ij}^{t} = 1$. When $\sum_{k=1}^{q} c_{ij}^{k} < \theta_2$, $R_{ij} = 0$, and most likely either both $N_i$ and $N_j$ are good or both are faulty. Otherwise, $R_{ij} = 1$, and $N_i$ and $N_j$ are most likely in different status.
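As a minimal sketch of the pairwise test described above, the following Python computes the per-round test bits and the comparison result for one pair of sensors from their q readings. The function name `comparison_result`, the use of the absolute difference, the sample readings, and the choice of `>=` when the mismatch count is compared against the second threshold are assumptions made for illustration; the text does not say how ties are handled.

```python
from typing import List, Sequence, Tuple


def comparison_result(xi: Sequence[float], xj: Sequence[float],
                      theta1: float, theta2: float) -> Tuple[List[int], int]:
    """Return (test bits c^t for t = 1..q, comparison result R) for one sensor pair."""
    if len(xi) != len(xj) or not xi:
        raise ValueError("both sensors need the same non-zero number of readings")
    # c^t = 0 when the readings of round t differ by less than theta1, else 1.
    c = [0 if abs(a - b) < theta1 else 1 for a, b in zip(xi, xj)]
    # Assumption: R = 1 as soon as the number of mismatching rounds reaches theta2.
    r = 1 if sum(c) >= theta2 else 0
    return c, r


# Hypothetical readings over q = 5 rounds; the second sensor misbehaves intermittently.
steady = [71.0, 72.5, 70.2, 74.1, 73.0]
intermittent = [72.0, 110.0, 71.0, 105.0, 72.5]
bits, r = comparison_result(steady, intermittent, theta1=15, theta2=2)
print(bits, r)
```

Repeating the test over several rounds is what lets an intermittent fault show up: a single snapshot taken in round 1, 3 or 5 of this example would see no difference between the two sensors.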

Sensors can be either LG or LF, determined by using the comparison values from their neighboring sensors. Each sensor sends its tendency value to all its neighbors. We use the judgment principle proposed in [5] to determine whether the sensors are GD or FT. That is, over all $N_j \in S(N_i)$ with $T_j$ = LG, $\sum (1 - R_{ij}) - \sum R_{ij} = \sum (1 - 2R_{ij})$ must be greater than or equal to $|S(N_i)|/2$ to claim that $N_i$ is good.
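The two-stage judgment can be illustrated with a short Python sketch: first a tendency value (LG or LF) from a sensor's own comparison results, then the GD decision from the votes of its LG neighbors. The function names, the dictionary layout, and the strict `<` used for the LG test are assumptions for illustration, not part of the original description.

```python
from typing import Dict


def tendency(r_to_neighbors: Dict[str, int]) -> str:
    """LG if fewer than half of the neighbors disagree (R = 1), otherwise LF."""
    n = len(r_to_neighbors)
    return "LG" if sum(r_to_neighbors.values()) < n / 2 else "LF"


def is_good(r_to_neighbors: Dict[str, int],
            neighbor_tendency: Dict[str, str]) -> bool:
    """Check that the sum of (1 - 2 R_ij) over LG neighbors is at least |S(N_i)| / 2."""
    score = sum(1 - 2 * r for j, r in r_to_neighbors.items()
                if neighbor_tendency[j] == "LG")
    return score >= len(r_to_neighbors) / 2


# Hypothetical sensor with four neighbors: it agrees with three LG neighbors
# and disagrees with one LF neighbor.
r_i = {"a": 0, "b": 0, "c": 0, "d": 1}
t_nb = {"a": "LG", "b": "LG", "c": "LG", "d": "LF"}
print(tendency(r_i), is_good(r_i, t_nb))
```

Each agreeing LG neighbor contributes +1 and each disagreeing LG neighbor contributes -1, so the score is the margin by which trustworthy neighbors side with the sensor; LF neighbors are ignored.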

Unlike [5], our algorithm applies the above process only to the sensors in a particular level of the tree in each round; once a GD sensor is found, its test result is used to diagnose the other nodes in the sub-tree. The communication cost is therefore lower than that of the algorithm proposed in [5]. In addition, we collect sensing data $q$ times in order to handle intermittent faults.

B. Algorithm

In the algorithm, the base station chooses one of its children at random to start the diagnosis. The algorithm can continue to execute only if at least one node is determined to be GD. The proposed fault diagnosis algorithm is given in Table 2.

Table 2 Algorithm for fault diagnosis

Step 1: Each sensor $N_a$ tests its parent sensor $N_b$ to generate the comparison result $R_{ab} \in \{0,1\}$ using the following method:
    Set $R_{ab} = 0$ and compute $d_{ab}^{t}$;
    IF $|d_{ab}^{t}| > \theta_1$ THEN $c_{ab}^{t} = 1$;
    ELSE $c_{ab}^{t} = 0$;
    IF $\sum_{k=1}^{q} c_{ab}^{k} \ge \theta_2$ THEN $R_{ab} = 1$;

Step 2: The base station chooses one of its children $N_i$ at random; $N_i$ tests every member $N_j \in S(N_i)$ to generate the comparison result $R_{ij} \in \{0,1\}$ using the following method:
    Set $R_{ij} = 0$ and compute $d_{ij}^{t}$;
    IF $|d_{ij}^{t}| > \theta_1$ THEN $c_{ij}^{t} = 1$;
    ELSE $c_{ij}^{t} = 0$;
    IF $\sum_{k=1}^{q} c_{ij}^{k} \ge \theta_2$ THEN $R_{ij} = 1$;

Step 3: $N_i$’s neighbor $N_j$ generates a tendency value $T_j$ based upon its neighboring sensors’ comparison values:
    $N_j$ tests its neighbors as in Step 2;
    IF $\sum_{N_t \in S(N_j)} R_{jt} < |S(N_j)|/2$, where $|S(N_j)|$ is the number of $N_j$’s neighboring nodes,
    THEN $T_j$ = LG;
    ELSE $T_j$ = LF;

Step 4: Compare the number of $N_i$’s LG neighboring sensors with different comparison results to determine its status:
    IF $\sum_{N_j \in S(N_i),\ T_j = \mathrm{LG}} (1 - 2R_{ij}) \ge |S(N_i)|/2$
    THEN $T_i$ = GD, go to Step 5;
    ELSE IF ($N_i$ has children) THEN choose one of its children $N_m$ at random; $N_m$ executes the algorithm as $N_i$;
    ELSE go to Step 2;

Step 5: For the other sensors in the sub-tree including $N_i$, diagnose them as follows:
    FOR $N_i$’s child or parent $N_j$:
        IF $R_{ij} = 0$ THEN $T_j$ = GD; ELSE $T_j$ = FT;
    THEN use $N_j$ to diagnose $N_j$’s child or parent $N_c$:
        IF $N_j$ = GD THEN
            IF $R_{jc} = 0$ THEN $T_c$ = GD; ELSE $T_c$ = FT;
        ELSE IF $R_{jc} = 0$ THEN $T_c$ = FT;
        ELSE IF (a condition on $q$ and $\sum_{k=1}^{q} c_{jc}^{k}$ whose exact form is illegible) THEN $T_c$ = FT;
        ELSE test every member $N_d \in S(N_c)$ to generate the comparison result $R_{cd} \in \{0,1\}$;
            IF $\sum_{N_d \in S(N_c)} R_{cd} < |S(N_c)|/2$ THEN $T_c$ = GD;
            ELSE $T_c$ = FT;
    Repeat the process to diagnose the remaining sensors through the parent-child relationship.

Step 6: For the sensors in other sub-trees, do the following:
    FOR $N_i$’s neighbors:
        Do the process as in Step 5 to diagnose themselves and their parents or children;
        Different from Step 5: IF $N_j$ = FT THEN
            IF $N_d \in S(N_c)$ and $T_d$ = GD
            THEN $N_c$ compares with $N_d$;
            ELSE $N_c$ executes as in Step 5.
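The parent-to-child labelling of Step 5 can be sketched in Python as a breadth-first walk from a sensor already judged GD. This is a rough illustration under stated assumptions, not the authors' implementation: the function name `propagate` and its data layout are hypothetical, only the downward (parent to child) direction is modelled, the neighbor vote is passed in as a precomputed LG/LF value, and the additional test on the q repeated comparisons that Step 5 applies before the neighbor vote is left out because its exact form is not clear from the text.

```python
from collections import deque
from typing import Dict, List, Tuple


def propagate(root: str, children: Dict[str, List[str]],
              r: Dict[Tuple[str, str], int],
              neighbor_vote: Dict[str, str]) -> Dict[str, str]:
    """Label every node below a GD root as GD or FT, walking parent to child."""
    status = {root: "GD"}
    queue = deque([root])
    while queue:
        parent = queue.popleft()
        for child in children.get(parent, []):
            r_pc = r[(parent, child)]
            if status[parent] == "GD":
                # A good parent is a trusted reference: agreement means good.
                status[child] = "GD" if r_pc == 0 else "FT"
            elif r_pc == 0:
                # Agreeing with a faulty parent marks the child as faulty.
                status[child] = "FT"
            else:
                # Disagreeing with a faulty parent is inconclusive, so fall
                # back on the child's neighbor vote (LG -> GD, LF -> FT).
                status[child] = "GD" if neighbor_vote[child] == "LG" else "FT"
            queue.append(child)
    return status


# Hypothetical sub-tree: A (judged GD) -> B -> C -> {D, H}.
children = {"A": ["B"], "B": ["C"], "C": ["D", "H"]}
r = {("A", "B"): 0, ("B", "C"): 1, ("C", "D"): 1, ("C", "H"): 1}
neighbor_vote = {"D": "LG", "H": "LF"}
status = propagate("A", children, r, neighbor_vote)
print(status)
```

Only the nodes that disagree with a faulty parent need to consult their neighbors; every other node is labelled from a single comparison with its parent, which is where the saving in communication comes from.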

C. Example

To illustrate how the algorithm may be used in practice, we now apply it to an example network in which 9 of the nodes are faulty. The example network is partially depicted in Fig. 2; only the faulty nodes and their nearest responsive neighbors are shown (the complete topology of the network is not needed to demonstrate the scheme). In the following we consider how the algorithm executes in one of the sub-trees.

Figure 2 The execution of the diagnosis algorithm in one of the sub-tree

First, each sensor from A to N tests its parent as specified in Step 1 of our algorithm. The tests are shown by dotted lines with double arrows, and the values of $R_{ij}$ are given beside them in Fig. 2.

Secondly, the base station selects one of its children, A, at random, and A executes Step 2, Step 3 and Step 4 to judge whether it is GD. In this process, A communicates with its neighbors within the transmission range, and the result is obtained through two rounds of judgment. If A cannot be judged as GD, the algorithm lets A’s children execute the process layer by layer until one of them can be judged as GD. If the algorithm cannot find a GD sensor in this sub-tree, the base station selects another child to execute the algorithm. We assume that A can be judged as GD in Fig. 2.

Then, we use A according to the definition to judge the remaining sensors. As specified in Step 5, A is GD and $R_{AB} = 0$, so we can judge that B is GD. After B has been judged, we can use it to judge its child C. B is GD and $R_{BC} = 1$, so C must be FT. If $R_{CH} = 0$ and C is FT, we can judge H as FT. If $R_{CH} = 1$, as with $R_{CD} = 1$, we cannot judge D and H directly. In this case our algorithm judges them by means of their neighbors: D and H execute Step 2 and Step 3 to determine $T_D$ and $T_H$. If C is FT, $R_{CD} = 1$ and $T_D$ = LG, we can judge D as GD. If C is FT, $R_{CH} = 1$ and $T_H$ = LF, we can judge H as FT. In this way, we can judge the sensors layer by layer.

Finally, we can use A to judge its neighbors, and they can judge their own sub-trees as A does, following Step 6. When their children meet the same case as C, such as L, they can judge their children quickly if they have a GD neighbor, because if a sensor is GD and $R_{ij}$ is known, its neighbors can be judged easily according to the definition.

V. SIMULATION RESULTS

To evaluate the performance of fault diagnosis, we implemented the simulation in C++ and consider two metrics: correct detection rate (CDR) and false alarm rate (FAR). CDR is the probability that a faulty sensor is diagnosed as faulty. Similarly, FAR is the probability that a fault-free sensor is diagnosed as faulty.
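For concreteness, the two metrics can be computed from a ground-truth labelling and a diagnosis as in the Python sketch below, where CDR is taken as the fraction of faulty sensors diagnosed as faulty and FAR as the fraction of fault-free sensors diagnosed as faulty. The function name `detection_rates`, the dictionary layout and the sample labels are hypothetical.

```python
from typing import Dict, Tuple


def detection_rates(truth: Dict[str, bool],
                    diagnosed: Dict[str, bool]) -> Tuple[float, float]:
    """Return (CDR, FAR); each dict maps a node to True when it is (judged) faulty."""
    faulty = [n for n, is_faulty in truth.items() if is_faulty]
    good = [n for n, is_faulty in truth.items() if not is_faulty]
    # CDR: share of truly faulty sensors that were diagnosed as faulty.
    cdr = sum(diagnosed[n] for n in faulty) / len(faulty) if faulty else float("nan")
    # FAR: share of fault-free sensors that were wrongly diagnosed as faulty.
    far = sum(diagnosed[n] for n in good) / len(good) if good else float("nan")
    return cdr, far


# Hypothetical outcome for six sensors: two faulty, four fault-free.
truth = {"n1": True, "n2": True, "n3": False, "n4": False, "n5": False, "n6": False}
diagnosed = {"n1": True, "n2": False, "n3": True, "n4": False, "n5": False, "n6": False}
cdr, far = detection_rates(truth, diagnosed)
print(cdr, far)
```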

The experimental scenario consists of 1024 sensors randomly placed in a 32 × 32 square area. The measurement $x_i$ is taken to be temperature. The good and faulty values of $x_i$ are 70-75 and 100-120, respectively. The number of measurements $q$ is set to 10. We set the threshold values $\theta_1$ and $\theta_2$ to 15 and 2, respectively.

In the simulation, sensors are randomly chosen to be faulty with the probabilities of 0.05, 0.10, 0.15, 0.20 and 0.25 respectively under different number of neighbors for each sensor. The number of neighbor sensors n is chosen to be 7, 10, 15 and 20 respectively.
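A scenario of this kind could be generated along the following lines. This Python sketch only illustrates the setup and is not the C++ simulator used for the reported results; the function name `make_scenario`, the uniform distributions for positions and readings, the independent fault draw per sensor, and the fixed seed are all assumptions. Choosing the transmission range so that each sensor has about n neighbors is not modelled here.

```python
import random
from typing import List, Tuple


def make_scenario(num: int, side: float, p: float,
                  seed: int = 0) -> List[Tuple[float, float, bool, float]]:
    """Place sensors uniformly at random and draw one reading for each.

    Each entry is (x, y, faulty, reading).
    """
    rng = random.Random(seed)
    sensors = []
    for _ in range(num):
        x, y = rng.uniform(0, side), rng.uniform(0, side)
        # Assumption: every sensor is faulty independently with probability p.
        faulty = rng.random() < p
        # Good readings fall in 70-75, faulty readings in 100-120.
        reading = rng.uniform(100, 120) if faulty else rng.uniform(70, 75)
        sensors.append((x, y, faulty, reading))
    return sensors


sensors = make_scenario(num=1024, side=32, p=0.10)
print(sum(s[2] for s in sensors), "of", len(sensors), "sensors are faulty")
```

With good readings in 70-75 and faulty readings in 100-120, any good/faulty pair differs by at least 25, so a first threshold of 15 separates the two groups in every round of this simplified setup.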

Figure 3 Correct detection rate in ADFD (x-axis: sensor fault probability, 0.05 to 0.25; y-axis: correct detection rate, 0.95 to 1; curves for n = 7, 10, 15 and 20)

Figure 4 False alarm rate in ADFD

Fig. 3 and Fig. 4 show the correct detection rate and false alarm rate in ADFD, respectively. Comparing the curves for n = 7, n = 10, n = 15 and n = 20 shows that the detection accuracy is fairly high and the false alarm rate is considerably low, especially for n = 20.

VI. CONCLUSION

In this paper, we have presented a distributed fault diagnosis algorithm for tree-like WSNs, based on [5]. The algorithm inherits the judgment principle proposed in [5] to maintain the detection accuracy and false alarm rate. In addition, it collects sensing data $q$ times when comparing the measurements $x_i$, which handles intermittent faults efficiently. Compared with the algorithm in [5], our algorithm exploits the tree-like routing structure to decrease the communication cost. The simulation results show that the CDR is over 96% even when 25% of the nodes are faulty, and the FAR is considerably low. Overall, our algorithm outperforms the fault diagnosis algorithm proposed in [5]. In our future work, we will implement the algorithm on the NS2 network simulator.

REFERENCES

[1] I.F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci. Wireless Sensor Networks: A Survey. Computer Networks, vol. 38, 2002. pp. 393-422.

[2] G.J. Pottie, W.J. Kaiser. Wireless Integrated Network Sensors. Comm. ACM, vol. 43, no. 5, 2000. pp. 551-558.

[3] D.P. Agrawal, M. Lu, T.C. Keener, M. Dong, and V. Kumar. Exploiting the Use of Wireless Sensor Networks for Environmental Monitoring. EM Magazine, Aug. 2004. pp. 27-33.

[4] J. Gao, Y. Xu and X. Li. Weighted-Median Based Distributed Fault Detection for Wireless Sensor Networks. Journal of Software 2007. pp. 1208-1217.

[5] J. Chen, S. Kher, A. Somani. Distributed fault detection of wireless sensor networks. In: Proc. of the 2006 Workshop on Dependability Issues in Wireless Ad Hoc Networks and Sensor Networks (DIWANS 2006). 2006. pp. 65-72.

[6] L.B. Ruiz, I.G. Siqueira, LBe Oliveira, H.C. Wong, JM Nogueira, AAF Loureiro. Fault management in event-driven wireless sensor networks. In: Proc. of the 7th ACM Int’l Symp. on Modeling, Analysis and Simulation of Wireless and Mobile Systems. Venice, 2004. pp. 149-156.

[7] F. Koushanfar, M. Potkonjak, A. Sangiovanni-Vincentelli. On-Line fault detection of sensor measurements. In: Proc. of the IEEE Sensors. 2003. pp. 974-979.

[8] M. Ding, D. Chen, K. Xing, and X. Cheng. Localized fault-tolerant event boundary detection in sensor networks. In: Proc. of IEEE INFOCOM 2005, Miami, March 2005.

[9] X. Luo, M. Dong, and Y. Huang. On distributed fault-tolerant detection in wireless sensor networks. IEEE Transactions on Computers, Vol.55, No.1, Jan. 2006. pp. 58-70.
