7
Effective Probabilistic Approach Protecting Sensor Traffic Xiaoyan Hong*, Pu Wangt, Jiejun Kongt, Qunwei Zheng*, Jun Liu* * Computer Science Department, University of Alabama, Tuscaloosa, AL 35487 t Mathematics Department, University of Alabama, Tuscaloosa, AL 35487 t Computer Science Department,University of California, Los Angeles, CA 90095 Abstract- Sensor networks are often deployed in environ- ments where malicious nodes present. Among all possible forms of the attacks threatening the sensor networks, in this work, we focus on traffic analysis attacks. Typically, in performing traffic analysis, an attacker will eavesdrop on-going wireless transmissions and analyze contents and timing instances of the transmissions so to infer critical events or to trace valuable assets in the network (e.g, data sources or sinks). The paper presents a probabilistic approach to shape the sensor network traffic to de- correlate time instances in transmissions. The security properties of the approach are studied both analytically and empirically, showing strong protection in high probability. I. INTRODUCTION Rapid advances of wireless sensor network technologies have enabled numerous applications in civilian, law enforce- ment and military environments, from large scale environmen- tal monitoring, structural defect monitoring, border intrusion tracking, hostile terrain surveillance, to small scale personal health monitoring. Many applications of wireless sensor net- works require strong security protection over the network and over the data collected and transmitted in the network. Among various security threats, traffic analysis is one that has been made easier in the wireless medium than the wired lines, in that a malicious node can eavesdrop all the wireless transmissions without physically attaching to a line or com- promising a node. Though the wireless communications can be protected by strong cryptographic methods at application (end-to-end) or at link level, an eavesdropper can still obtain information on traffic pattern, traffic change pattern, or end- to-end traffic flow paths [8] [9]. With such information, the eavesdropper can infer the events happening in the network, or trace the paths to the sources/destinations of the flows or the locations of the events. More deadly consequences could directly result from the information. Thus countermeasures that prevent sensor traffic against traffic analysis have to be studied as an important sensor security issue. Traffic analysis [1] can be performed on both packet contents (content analysis) and timing events of transmissions (timing analysis). Typical countermeasures to content analysis are padding data packet into a constant length and using per hop key to encrypt packets. When source and destination addresses in packet headers are concerned, per hop link level encryption or link pseudonyme can be used [11]. With these methods, every transmission looks different to an observer. Though encryption and decryption is considered expensive with regard to a sensor's resource constraints, measurements have shown that the overhead is tolerable, e.g., the times that a Berkeley MICA1 Mote sensor takes to encrypt and decrypt 30 bytes data using RC5 are 1.94 and 2.02 msec respectively [6] [7]. In addition, encryption serves a broad range of security purposes like data integrity, content privacy, access authentication, etc. Timing analysis [15] can not be prevented with the en- cryption methods. Successful correlation between two trans- mission events reveals the path of a data flow. Solutions studied for wired networks [2][4][14][17] use traffic mixing techniques including sending messages in reordered batches, sending decoy/dummy messages, and introducing random delays. However, sensor networks have features different from a wired network, requiring reevaluation of the existing work and new designs of anti timing analysis solutions. Features of wireless sensor networks include constraints on the resources each sensor possesses, namely, computing, storage, bandwidth and energy. Such constraints raise con- cerns on a scheme that requires decoy packets due to the extra usage of bandwidth and energy. On the other hand, communications among sensors is multiplexed within a radio range (depending on encryption method) in the wireless medium, which helps hiding a particular transmission among neighbors' transmissions. Our study, thus, adopts the random delay approach instead of using decoy packets and exploits the multiplexing feature of the wireless medium to protect sensor traffic from timing analysis. We base our work on the assumption that a sender is able to hide the identity of its intended receiver in transmission but not its own. The rationales behind this assumption are that: on one hand, the sender can either use link level encryption or simply broadcast its messages to hide the receiver. The intended receiver shall use the right key to decrypt the mes- sages; on the other hand, an eavesdropper assisted with radio detection technique can detect a wireless transmission from a referable location and name it. By discovering the temporal dependency of two transmissions at different locations, he acquires the knowledge about the relationship of two consec- utive nodes along a data path. When attackers have substantial network-wide monitoring ability, or physical mobile tracking ability, they will be able to integrate piecewise information into global knowledge to finally infer the locations of the events or the base stations in the sensor network. In this paper, we study random delay strategies to prevent temporal correlation of wireless transmissions within a neigh- borhood. Specifically, our schemes introduce random delays independently and distributively at each hop. Our strategies do not use dummy messages but exploit the broadcast media in improving the effectiveness and the efficiency of the schemes. The key of such design is how to derive the delay and what the anonymity properties will be. We introduce two schemes and analyze their anonymity properties in the paper. The work

[IEEE MILCOM 2005 - 2005 IEEE Military Communications Conference - Atlantic City, NJ, USA (17-20 Oct. 2005)] MILCOM 2005 - 2005 IEEE Military Communications Conference - Effective

Embed Size (px)

Citation preview

Page 1: [IEEE MILCOM 2005 - 2005 IEEE Military Communications Conference - Atlantic City, NJ, USA (17-20 Oct. 2005)] MILCOM 2005 - 2005 IEEE Military Communications Conference - Effective

Effective Probabilistic Approach Protecting Sensor Traffic

Xiaoyan Hong*, Pu Wangt, Jiejun Kongt, Qunwei Zheng*, Jun Liu**

Computer Science Department, University of Alabama, Tuscaloosa, AL 35487t Mathematics Department, University of Alabama, Tuscaloosa, AL 35487

t Computer Science Department,University of California, Los Angeles, CA 90095

Abstract- Sensor networks are often deployed in environ-ments where malicious nodes present. Among all possible formsof the attacks threatening the sensor networks, in this work,we focus on traffic analysis attacks. Typically, in performingtraffic analysis, an attacker will eavesdrop on-going wirelesstransmissions and analyze contents and timing instances of thetransmissions so to infer critical events or to trace valuable assetsin the network (e.g, data sources or sinks). The paper presents aprobabilistic approach to shape the sensor network traffic to de-correlate time instances in transmissions. The security propertiesof the approach are studied both analytically and empirically,showing strong protection in high probability.

I. INTRODUCTION

Rapid advances of wireless sensor network technologieshave enabled numerous applications in civilian, law enforce-ment and military environments, from large scale environmen-tal monitoring, structural defect monitoring, border intrusiontracking, hostile terrain surveillance, to small scale personalhealth monitoring. Many applications of wireless sensor net-works require strong security protection over the networkand over the data collected and transmitted in the network.Among various security threats, traffic analysis is one thathas been made easier in the wireless medium than the wiredlines, in that a malicious node can eavesdrop all the wirelesstransmissions without physically attaching to a line or com-promising a node. Though the wireless communications canbe protected by strong cryptographic methods at application(end-to-end) or at link level, an eavesdropper can still obtaininformation on traffic pattern, traffic change pattern, or end-to-end traffic flow paths [8] [9]. With such information, theeavesdropper can infer the events happening in the network,or trace the paths to the sources/destinations of the flows orthe locations of the events. More deadly consequences coulddirectly result from the information. Thus countermeasuresthat prevent sensor traffic against traffic analysis have to bestudied as an important sensor security issue.

Traffic analysis [1] can be performed on both packetcontents (content analysis) and timing events of transmissions(timing analysis). Typical countermeasures to content analysisare padding data packet into a constant length and using perhop key to encrypt packets. When source and destinationaddresses in packet headers are concerned, per hop link levelencryption or link pseudonyme can be used [11]. With thesemethods, every transmission looks different to an observer.Though encryption and decryption is considered expensivewith regard to a sensor's resource constraints, measurementshave shown that the overhead is tolerable, e.g., the timesthat a Berkeley MICA1 Mote sensor takes to encrypt anddecrypt 30 bytes data using RC5 are 1.94 and 2.02 msec

respectively [6] [7]. In addition, encryption serves a broadrange of security purposes like data integrity, content privacy,access authentication, etc.

Timing analysis [15] can not be prevented with the en-cryption methods. Successful correlation between two trans-mission events reveals the path of a data flow. Solutionsstudied for wired networks [2][4][14][17] use traffic mixingtechniques including sending messages in reordered batches,sending decoy/dummy messages, and introducing randomdelays. However, sensor networks have features different froma wired network, requiring reevaluation of the existing workand new designs of anti timing analysis solutions.

Features of wireless sensor networks include constraintson the resources each sensor possesses, namely, computing,storage, bandwidth and energy. Such constraints raise con-cerns on a scheme that requires decoy packets due to theextra usage of bandwidth and energy. On the other hand,communications among sensors is multiplexed within a radiorange (depending on encryption method) in the wirelessmedium, which helps hiding a particular transmission amongneighbors' transmissions. Our study, thus, adopts the randomdelay approach instead of using decoy packets and exploitsthe multiplexing feature of the wireless medium to protectsensor traffic from timing analysis.We base our work on the assumption that a sender is able

to hide the identity of its intended receiver in transmission butnot its own. The rationales behind this assumption are that:on one hand, the sender can either use link level encryptionor simply broadcast its messages to hide the receiver. Theintended receiver shall use the right key to decrypt the mes-sages; on the other hand, an eavesdropper assisted with radiodetection technique can detect a wireless transmission from areferable location and name it. By discovering the temporaldependency of two transmissions at different locations, heacquires the knowledge about the relationship of two consec-utive nodes along a data path. When attackers have substantialnetwork-wide monitoring ability, or physical mobile trackingability, they will be able to integrate piecewise informationinto global knowledge to finally infer the locations of theevents or the base stations in the sensor network.

In this paper, we study random delay strategies to preventtemporal correlation of wireless transmissions within a neigh-borhood. Specifically, our schemes introduce random delaysindependently and distributively at each hop. Our strategies donot use dummy messages but exploit the broadcast media inimproving the effectiveness and the efficiency of the schemes.The key of such design is how to derive the delay and whatthe anonymity properties will be. We introduce two schemesand analyze their anonymity properties in the paper. The work

Page 2: [IEEE MILCOM 2005 - 2005 IEEE Military Communications Conference - Atlantic City, NJ, USA (17-20 Oct. 2005)] MILCOM 2005 - 2005 IEEE Military Communications Conference - Effective

is presented as follows. We start with brief reviews on relatedterminologies and previous work in Section II, follow by thedescriptions of the schemes in Section III. Section IV analyzesthe security properties of the schemes, and Section V validatesthem through simulations. Section VI concludes the paperwith future work outlined.

II. BACKGROUND AND RELATED WORK

A. Terminology

The concept of anonymity is defined as "the state ofbeing not identifiable within a set of subjects, the anonymityset" [13]. The anonymity set is the set of the identities ofpossible senders/recipients [5][16]. The larger the set is, thebetter the anonymity protection will be. Further, anonymity isdefined in terms of the relationship of a sender and a receivernot to be identified. If an algorithm always provides the size ofthe anonymity set to be greater than 1, the algorithm providesdeterministic anonymity. Otherwise, if an algorithm has aprobability that the size of the set goes to 0 in an exponentiallydecreasing rate given a linear increasing complexity parameterof the algorithm, the algorithm is said to provide probabilisticanonymity [10].

B. Related work

Strategies thwarting timing analysis are studied withinmany Internet anonymity protocols, e.g., with various MIX-Net designs [3][10][14]. The strategies include introducingrandom delay; buffering messages in a pool, reordering mes-sages and flushing the pool; injecting dummy/decoy packets;and padding each packet to the same length or random lengths.For example, a delay-and-playout technique (traffic mixing)is used in [2][14]. Serjantov et al. [17] classify traffic mixingstrategies into two major categories simple MIXes and poolMIXes. Both categories use either one threshold or both: amessage pool size n and/or a time period t that a messagestays in the pool. Decoy messages could be used whennecessary. These threshold based schemes are vulnerable toeither flooding attacks, where the adversary sends its own n-Imessages to flush a pool of size n; or trickle attacks, where theadversary blocks all the incoming messages except the targetone until the mix node sends it out after t; or the two blended.Various mixing schemes have been proposed to reduce thesuccess ratio of such attacks [17].A close related work is Stop-and-Go-MIX (SG-MIX) [10].

SG-MIX does not use thresholds on a message pool or atime window, nor decoy messages. Rather, it uses independentrandom delays for each message. In SG-MIX, the sourcepre-computes a route passing through a few MIX nodesand values of random delays at each node. The values areencrypted by the keys of each node en route and embed-ded in the message header. SG-MIX achieves probabilisticanonymity, i.e., two transmissions could be correlated withan exponentially decreasing probability when traffic intensityincreases linearly. However, SG-MIX can not be used inwireless sensor networks due to many reasons: (i) The ene-to-end approach is not suitable for distributed sensor networks;(ii) pre computed data paths can not be obtained in sensor

networks; (iii) cryptographic keys are more likely managedlocally in sensor networks; (iv) many applications use mobilesensor networks; and moreover, (v) while per flow treatmentby SG-MIX is not a problem for a wired line because MIXesusually have high traffic volume, inputs to a wireless sensornode is not as high as those to a wired MIX. Our schemes usea distributed approach and exploit the broadcast nature of thewireless medium to facilitate anti-traffic analysis strategies.

For wireless multihop networks, preventing traffic analysishas been studied as a routing problem. Jiang [9] presenteda routing algorithm that finds appropriate routing paths forvarious end-to-end flows aiming at maintaining global linktraffic patterns as invariable as possible. Our work differsfrom the work in that we model aggregated traffic in a localtransmission radius. For sensor networks, anti-traffic analysisstrategies have also been presented in [6], where windowbased delay randomization and dummy packets are used. Ourapproach does not use dummy packets, rather, longer delaysin trading for bandwidth. By exploiting the wireless media,we are able to control the overall average delay.

III. PROBABILISTIC TRAFFIC SHAPING

In this section, we describe two random delay strategiesto shape data traffic so that direct correlation of a previoustransmission (by the upstream node) with the current oneis impossible with high probability. Our strategies do notuse dummy messages but exploit the broadcast media inimproving the effectiveness and the efficiency of the schemes.Analysis of the anonymity properties will be given in the nextsection.

A. Network ModelIn our targeted sensor network scenario, all the sensors

independently generate data reports and relay reports for someof its neighbors. The packets are padded to a constant size.We assume all the sensors have a radio range of r. Sensorswithin the range r can either send to or receiver from eachother. Our assumptions on sensor security follow the onesused by [6][12], i.e., sensors have established pair-wise keysto encrypt transmissions at each hop and also to decrypt thepackets and process them at the intended receivers within thetransmission range. A further relay of the message will bere-encrypted to generate a different payload pattern. Note thisnetwork security model is vulnerable to localized DOS attacks(energy depletion) but the damage will not propagate further.

Using current radio detection technique, an eavesdroppercan easily detect a wireless transmission from a referablelocation. It can identify this transmission using its own coor-dination system and name convention. When it intercepts twoconsecutive transmissions from two sensors that are withinthe range r, the eavesdropper can perform traffic analysis onthe two packets for content correlation and timing correlation.With the above sensor security assumption, content correlationis impossible (is hard or resource consuming). Because arelay node will re-encrypt the content with a different key,the same data payload appears differently at every relaytransmission. However timing correlation is possible if onepacket is sent after a reasonable delay counting the processing

Page 3: [IEEE MILCOM 2005 - 2005 IEEE Military Communications Conference - Atlantic City, NJ, USA (17-20 Oct. 2005)] MILCOM 2005 - 2005 IEEE Military Communications Conference - Effective

and media access control. More over, a compromised node ismore dangerous in that it can act protocol-compliantly butto break the privacy and security system. For example, itcan generate its own payload and monitor who will relaythe packet. When an adversary detects the relation betweenupstream and down stream nodes, it can trace the routing pathtowards either the data source or the destination (base station)by physical movements or by feeding information to its globaladversarial information center for integration with other data.These actions could lead to deadly consequences.

B. Distributed Random Delay StrategiesCommunications in wireless sensor networks typically lack

global centralized control and ene-to-end knowledge. MIX-net techniques, including SG-MIX, can not be used directly.Our attempt is to apply the end-to-end SG-MIX scheme ina distributed way, i.e., each node along a path independentlyintroduces a random delay for the packet it relays. Thus weare able to avoid the many drawbacks of original SG-MIX.Our first scheme thus works as follows. In the scheme, a

sensor node independently generates data packets and emitsthem immediately after generation. A sensor also relayspackets for its neighbors. When a packet is received, it willbe delayed for a while before being emitted again. The valueof the delay draws from an exponential distribution withparameter ,u. When a transmission is delayed, the packet isput into a queue until the delay time expires. During the delayperiod or even before receiving the packet, the neighborsof the node may transmit or have transmitted packets, orthe node itself may have originated packets. Thus whenthis node transmits the delayed packet, an eavesdropper maynot be able to tell whether the current transmission is anoriginated data or a relay of an early transmission; or, towhich early transmission this packet might relate. Thus arelay transmission is hidden among its own data, neighbors'stransmissions or its other relay traffic.

The scheme presents the worst case when a packet isemitted after delay but with no mixing traffic. When thishappens, an eavesdropper has the highest probability (500 o) inguessing whether this transmission is a relay of the previousone or an emission by the node itself. If an eavesdropper isable to identify idle periods of the system, especially, whentraffic load is not high, he will have a higher probability incorrelating events (we defer analysis to the next section). Tobe more specific, let's define the following events.

* Event A: the queue is empty when the packet first arrivalsat the node;

* Event B: During the delay period, there is no othertransmissions in the neighborhood.

* Event C: During the delay period, there is no packetsoriginated by the node itself.

Thus at the time of an expiration, if at least one of thethree events does not occur, immediate transmission of thecurrent packet is considered safe because it is mixed withother transmissions when observed by another node in theneighborhood. However, when all of the events A, B andC have happened, transmission of the packet is less safebecause this is the only transmission following the previous

transmission by a neighbor. We will show that such probabilityis very low.We extend the above scheme to further eliminate the

worst case by introducing a second or more delay(s) if eachexpiration of delay returns finding event A happened at thefirst place and all the later delays have both events B and Chappened. There are several issues relate to this extension.First, this could lead to long total delay depending on trafficload. We argue that since a source always sends its packetswithout delaying, any long delay will eventually end whenany of its neighbors or itself sends a packet. The occurrenceof more than one delays has decreasing smaller probabilities.Second, we introduce a delay upper bound Dmax at eachhop. The upper bound enables the control over the overallend-to-end delay to meet QoS requirements. Especially, ifthe sensor network is delivering time critical data, the upperbound provides a tradeoff for optimal anonymity control andperformance. Finally, this method creates more chances fora malicious node to perform trickle (or blocking) and floodattack - only one packet is enough to trigger an emission fromits neighborhood. When this happens, this scheme reverts tothe previous one. However, a successful trickle and floodattack has also very small probability in success, given thatsensor sources are uniformly distributed and each sourcegenerates packets independently.We name the basic scheme probabilistic reshaping

(PRESH) and the extended one extended probabilistic re-shaping (exPRESH). In the next section, we will provideanalysis on the security properties of the two schemes. Thedistributions enable a better understanding, especially aboutthe trade-offs between delay and protection.

IV. SECURITY ANALYSIS

A. AttacksAn eavesdropper in the system running PRESH or

exPRESH has no predictable clues in correlating two trans-missions since each packet is delayed independently andrandomly. Transmission events {ei, e2, ... eh} that the eaves-dropper has collected during a time window Th do nothelp him in guessing which event the next transmission willmost likely relate to. Each event has the same probabilityof h However, if an eavesdropper is able to monitor theneighborhood for a long time, he might be able to identifythe busy period and the idle period of the system, especially,when traffic load is not high. Thus, when a packet enters thesystem with an empty queue, and when it is emitted againbut there is no other transmissions in the neighborhood up tothe time, the eavesdropper can correlate the two events mostsuccessfully. This presents the worst case for the two schemes.

In the wireless medium, a single compromised node canonly block traffic passing through itself. To successfullybreak the mixing scheme supported by the neighborhood,the adversary has to compromise a number of nodes in theneighborhood. These nodes then act coordinately to block thetransmissions in the neighborhood except the target packetfor a period in order to empty the queue and to wait forthe target packet to come out. If the adversary could obtainDmax, he can block the neighborhood for the same period to

Page 4: [IEEE MILCOM 2005 - 2005 IEEE Military Communications Conference - Atlantic City, NJ, USA (17-20 Oct. 2005)] MILCOM 2005 - 2005 IEEE Military Communications Conference - Effective

increase blocking success probability. Then exPRESH revertsto a threshold-based scheme. The success rate of such attackdepends on the time period the adversary spends in blockingthe traffic. After that the adversary only needs to send oneadditional packet to convince the target packet to come outafter its delay period. In this case, exPRESH reverts toPRESH. After all, launching such attacks requires very strongadversary.

B. Assumptions

For the convenience of the analysis, we assume transmis-sion delay and processing delay are negligible compared tothe random delay we introduced. Benchmark measurementsof Berkeley Mote MICA sensors show that transmission ofa sensor data packet uses 40-50 ms, encryption/decryptionrequires about 2 ms using RC5 for 30 bytes data, and sensorsgenerate data at a rate of 60s [6]. Thus it is safe to assume amean of random delay at 60s or tens seconds. The aggregatedtransmissions over the neighborhood, provide enough mixingtraffic.We base our analysis on the uniformity of node distribution

and traffic distribution. In a neighborhood with m nodes,each node originates data independently and each has equalprobability in relaying data for its neighbors. We simplifythe problem by assuming a packet flow will be relayed ina neighborhood only once. We also simplify the model bynot differentiating a specific node from its neighbors sincewe study the worst case. Thus, we model the traffic thateach node generates (both originating and relaying) as aPoisson process with parameter A. The aggregated traffic ina neighborhood is a Poisson distribution with parameter mA(including the node itself). At a specific node, a portion ofthe arrivals (at rate of A) will be processed (here, delayed andthen re-emitted) with a service time exponentially distributedwith parameter ,u immediately at arrival. From the viewpoint of an eavesdropper, the system we are modelling isthe neighborhood around it. So we redefine the events thatlead to the worst case, i.e., a transmission of a packet directlyfollowing its previous transmission, to the following:

* Event A': the queue is empty when the packet arrivalsat a node;

* Event B': during one delay period, there is no transmis-sions in the neighborhood.

Based on this aggregated neighborhood traffic model, we ana-lyze the anonymity properties for both PRESH and exPRESHschemes.

1) Probability ofworst case: The worst case happens whenthe two events A' and B' occur. The probability is

P(worstCase) = P(A' n B') = P(A')P(B')

For event A', P(A') is simply the probability that thesystem is empty at an arrival, which is e- M . For event B',the probability that during a service time there is no arrival isequal to the probability that an sample drawn from Exp(u) islarger than a sample from Exp(mA), which is mA('f. Thus,we have

P(worstCase) (1)1+p

2) Size of anonymity set: When a packet is emitted, thepacket is mixed within the anonymity set. The anonymity setof a packet X consists of the packets in the queue when Xarrivals, X itself and the packets arriving during the servicetime of X. The expectation of the number in the queue whena packet arrives is mA. And the expectation of the number

A

MA

arrivals during the busy period of a packet is e 2+±1 Thusthe expectation of the size of the anonymity set Uo is:

Uo =p+ 'eP + 12

(2)

Note that fimpO Uo = 1, i.e, when traffic load is verylow, the only transmission would be the packet itself. At thattime, the scheme fails to protect X by mixing it with others:limp-O P(worstCase) = 1.

3) Success rate ofblocking attack: Suppose a time intervalT is used by the adversary to block the traffic and then to sendits own one packet. The blocking attack will success if all thepackets in the system leave within time T. The memorylessproperty gives us the probability that a packet in the systemleaving within T as P[X < T] = -e-8T. At the time theblocking attack starts, the system could have arbitrary numberof packets, say i, with a probability of P[X = i] = P e-P.The probability that all of them leave the system within timeT is V[allLeaveX i] (P[X < T])' (1-e-LL)t Thus,the expectation of success probability of blocking attack is

Vo (T) >LP[X i](P[X < T])i > eI-P(I L-T)iti=O i=O

exp( ) (3)

Equation (3) shows that if a blocking interval T is set,the success probability decreases exponentially with the lineardecrease of the delay parameter ,u.

C. PRESH

For PRESH, a packet will be delayed only once. We modelthe scheme as a m/m/oo queue system, where the arrivalsare in a Poisson distribution with parameter mA, and infinitenumber of servers are able to serve each arrival immediatelywith a service time exponential distributed with parameter ,u.With this model, the analysis for PRESH follows the workpresented in [10] directly. We give the main results belowwith a definition on traffic intensity p = mA/,u.

D. exPRESH

For exPRESH, additional delays will be used whenPRESH's worst case occurs with a probability ofP(exPRESH) = P(worstCase). When that happens,an independent exponential delay starts. The procedurerepeats thereafter if the message terminates its service butfinds there is no arrival to the system during its last servicetime. The procedure stops until at least one new message

Page 5: [IEEE MILCOM 2005 - 2005 IEEE Military Communications Conference - Atlantic City, NJ, USA (17-20 Oct. 2005)] MILCOM 2005 - 2005 IEEE Military Communications Conference - Effective

has arrived or the maximum delay bound Dmax has beenreached, then the packet is emitted. When the scheme isunder blocking attack, the best strategy the attacker will useis to force a packet to come out after the first delay. In thatcase, exPRESH reverts to PRESH. The key performanceissue for exPRESH is the trade off between a longer delayand a zero probability of worst case. The total delay time apacket experiences becomes a critical factor. We model thesystem using the state transition diagram shown in Figure 1.In this model, the bound is not considered.

In the diagram, n, n > 0, denotes the states of queuelength n. States IF and IL are the steady states for thefirst and last message during a busy period in the system.The corresponding probabilities of queue length in steady-states are XJn X1L and X1F are for steady states IF and ILrespectively. We have the following balance equations:

mAxoTnAxlF

-0co

AX1L

mAxo

Note that when mA = A, fD (t) becomes a Gammadistribution with parameters mA and 2. Figure 2 illustrates thedistribution of the total delay needed for a successful mixingwith t = 1 packet/second. A reference curve of Exp(At)for PRESH is given too. The figure shows that when p = 5,exPRESH and PRESH have very close probabilities at longdelay times. Owing to neighborhood multiplexing, p = 5 iseasy to reach. For example, for a neighborhood with 10 activemembers, the mean delay could set to only a half of the meanpacket interarrival time.

p = 0.5 ep = 1.0

5.0 -----

0.8 reference, Exp(~.t)---

0.6

0.4

0.2

(mA + P)X1L 2AtX2(mA + 2At)X2 nmAxlF + mnAxL + 3A1X3

. . .

(mA + np)xJ = mmAxn-, + (n +l)x,n+,ln > 2

The derivation of xS is given in the Appendix I. An explicitexpression for xo is

e-PpO= (I +p) e-P (4)

Note that limp-o0 2=1. Based on the probabilities, weare able to give the following properties.

1i

ng (n+ 1)g

2/k

Fig. 1. State Transition Diagram

1) Delay Distribution: For exPRESH, a packet may bedelayed for many terms till there is at least one arrival (packetM) during its last delay term. Thus the total period D a

packet is delayed (the busy period) can be divided into twosub periods T1 and T2, where T1 is the time for the firstpacket M to arrival in a busy period, and T2 is the residualservice time after M arrivals. D = T1 + T2. Due to thememoryless property of the arrival and delay processes, T1has the exponential interarrival distribution with parametermA, and T2 has the exponential service time distribution withparameter t. The probability density function of D is:

Jt

fD (t) = mAe-' /eI(C s) ds

mAe[tt (1 -e-(mA -Wt)) ifmA #tp

TmAAte-t, if mA =

(5)(6)

O00 2 4 6 8 10

Time (at Rt = 1 packet/second)

Fig. 2. Distribution of Delay

2) Mean total delay time: Now we compute the mean ofthe total delay for exPRESH. The mean total delay time forany message can be computed by conditioning on the systemis empty or not when the message arrives at the system. Whenthe system is empty, the massage has to wait until one arrives.Thus,

mA Dxo) 1

At

1 xo+ mA

At MA

Substituting xo yields the mean total delay time

DI = l + (I + p) e-P) (7)

One can see that when the traffic intensity p is large, themean total delay is just the mean service time At- . When pis small, however, the total delay increases because the firstmessage in the busy period has to wait for a new one to come.The additional delay xo/mA could be substantially large. Anillustration of the function is given in Figure 3, where themean total delay is normalized to the mean service time. Inorder to demonstrate the convergence to 1, X axis starts at0.25 in the figure.

V. SIMULATIONS

We investigate the influence of aggregated traffic on themixing schemes PRESH and exPRESH through simulation.We use GlomoSim[18], a packet level simulator for wirelessand wired networks. In the simulation, 225 nodes are uni-formly distributed in a square field at an average nodal densityof 16 nodes per transmission radius. In an attempt to collectdata from uniform traffic pattern, we set sinks on one side ofthe field and all the data flow towards the same direction. Thestatistics are only collected from the nodes at the other side ofthe field. Those nodes also have their neighborhood within the

Page 6: [IEEE MILCOM 2005 - 2005 IEEE Military Communications Conference - Atlantic City, NJ, USA (17-20 Oct. 2005)] MILCOM 2005 - 2005 IEEE Military Communications Conference - Effective

2.8 -

2.6 -

2.478

0.6Normalized Mean Total Delay Time

0.5

0

2.2

w

2a).Ng 1.8

z 1.6

1.4 - Xx

1.2

1 ^o^^oooooooo1 2 3 4 5

p = mX/

Fig. 3. Mean Total Delay Time Normalized to ,u

-0-02n

0.4

0.3

0.2

0.1

0

0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5Traffic Intensity

Fig. 4. Probability of Worst Case

simulation area. Even though, for the simulations conducted,we can not enforce that one flow being relayed only once ina node's neighborhood. We run the simulations over multipleseeds for the random number generator. The results shouldreflect accuracy statistically. In simulations, we set the mean

date generating interval to be fixed at one report every 60seconds [6]. The mean delay time varies from one second to60 seconds. At short delay intervals like one second, a nodedoes not have enough aggregated transmissions to have at leastone packet on average in order to hide its own traffic. The Xaxis is the relay traffic intensity p0 for one node, i.e, A/,u. Thuswhen each node generates packets every 60s on average in a

neighborhood of 15 nodes, a node has to wait for 4 seconds(p0 = 0.067) on average to hear another transmission. If a

node delays a packet longer, it has better chance for mixing.We collect statistics to calculate average P(worstCase) for

PRESH and the size of the anonymity set for both schemes.For exPRESH scheme, the worst case occurs when the total ofextended delays exceeds the bound Dmax. At that time, even

without an arrival, the packet has to be emitted. We count theoccurrences of this event to calculate the P(worstCase) forexPRESH. Obviously the probability is affected by the valueof Dmax In the simulation, we vary Dmax to be one, two or

three times of the chosen mean delay for exPRESH.Figure 4 gives the probabilities for the worst cases. The

possibilities approach zero quickly when the traffic intensityincreases. Clearly exPRESH schemes decrease faster thanPRESH. In addition, the larger the delay bound, the smallerthe probabilities. The result suggests that we can trade delayfor stronger security protection. Figure 5 gives the sizes ofthe anonymity sets corresponding to the above configuredparameters. The sizes increase when load increases. The twoschemes demonstrate little differences due to the fact thatmost packets experienced the same neighborhood activities.Differences exist when load is low.

The distribution of the total delays that packets experiencedat each node is illustrated in Figure 6 and 7 with mean delaysetting to one second and 60 seconds respectively. For bothfigures, the bound Dmax is enforced in such a way that a

delay drawn from the Exp( must be finished even if Dmaxis met. This allows us to show the full distribution smoothly.Otherwise, one can expect the curves end with a peak at thecorresponding Dmax values. When calculating probability, theplacement of bins affects the shape of the curves slightly.Here, each figure takes 20 bins. The corresponding value is

N

1U,coW1

50

45

40

35

30

25

20

15

10

5

0

0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5Traffic Intensity

Fig. 5. Size of Anonymity Set

drawn at the right side of a bin. For example, a point drawnat "Per Hop Delay = Is" in Figure 6, counts data fallen inthe bin of (0.5,1]. This particular case actually counts for theleverage of a possible higher peak at "x = Is".

In Figure 6, PRESH shows exponential decay trend of theprobability with regard to the delay time, which is expected.For exPRESH, more terms of delays will be used since theprobability of the worst case occurring after the first delayis relatively high at one second as the mean delay time.The figure shows that exPRESH exhibits higher probabilitiesof longer overall delays than those of PRESH. The curves

shift towards the right side of PRESH. Especially, largerDmax shows more shifting corresponding to longer delays.When the mean delay reaches 60 second (Figure 7), all theschemes reveal the same distribution. This is because that theaggregated traffic intensity is so high that all the packets willbe transmitted after one delay no matter what scheme is inuse. The distribution converges to the exponential distributionwith ft.

VI. CONCLUSION AND FUTURE WORK

The paper has presented traffic shaping schemes to coun-

termeasure traffic analysis over sensor data transmissions.PRESH and exPRESH both use random delays to de-correlatetransmission events without introducing extra bandwidth over-

head for sensors. The approaches also exploit the trafficmultiplexing ability of the wireless medium to reduce theoverall delay and to reduce the probability of the worst case.

exPRESH presents an improvement over PRESH to eliminatethe occurrence of the worst case at the cost of longer ene-to-end delays. Analysis on the anonymity properties and simu-

exPRESH, 3 .MeanDelayexPRESH, 2*MeanDelayexPRESH, 1*MeanDelay

PRESH

---------- ---1--- ---------l

exPRESH, 3*MeanDelayexPRESH, 2*MeanDelayexPRESH, 1*MeanDelay

PRESH-;--9 x

-'P

I

Page 7: [IEEE MILCOM 2005 - 2005 IEEE Military Communications Conference - Atlantic City, NJ, USA (17-20 Oct. 2005)] MILCOM 2005 - 2005 IEEE Military Communications Conference - Effective

0.35

0.3

0.25

-02

n

0.2

0.15

0.1

0.05

0

0 1 2 3 4 5 6 7Per Hop Delay (s)

8 9

Fig. 6. Distribution of Delay at Mean = 1 second

-0co

0.2

0.18

0.16

0.14

0.12

0.1

0.08

0.06

0.04

0.02

0

0 50 100 150 200 250Per Hop Delay (s)

Fig. 7. Distribution of Delay at Mean = 60 seconds

lations are given in the paper. The main results are

very high probability, the schemes successfully mdata transmissions among neighborhood radio activthe insecure facts of the schemes decrease exponentlinear increase of mean delay time.Some applications of sensor networks do not de

uniform traffic patterns, e.g, applications with only a

stations will show traffic concentration near the baseOur future work will study traffic shaping strategiestraffic patterns. We will also analyze the impact oion neighborhood mixing technique.

REFERENCES

[1] A. Back, U. M6ller, and A. Stiglic. Traffic Analysis AttacksOffs in Anonymity Providing Systems. In I. S. Mosko)Fourth Internationl Workshop on Information Hiding (IH'Notes in Computer Science, 2137, pages 245-257, 2001.

[2] 0. Berthold, H. Federrath, and M. Kohntopp. Project AncUnobservability in the Internet. In Computers Freedom c

Conference 2000 (CFP 2000), Workshop on Freedom andDesign, 2000.

[3] 0. Berthold, H. Federrath, and S. Kopsell. Web MIXes: Aanonymous and unobservable Internet access. In H. FedefDIAU'00, Lecture Notes in Computer Science 2009, page2000.

[4] 0. Berthold and H. Langos. Dummy Traffic againstIntersection Attacks. In R. Dingledine and P. SyversProceedings ofPrivacy Enhancing Technologies WorkshopLecture Notes in Computer Science 2482, pages 110-128,

[5] D. L. Chaum. The Dining Cryptographers Problem: UiSender and Recipient Untraceability. Journal of CryptoloJ75, 1988.

[6] J. Deng, R. Han, and S. Mishra. Intrusion ToleranceTraffic Analysis Strategies for Wireless Sensor NetworksInternational Conference on Dependable Systems and Netwpages 594-603, 2004.

[7] P. Ganesan, R. Venugopalan, P. Peddabachagari, A. Dean, F. Mueller,and M. Sichitiu. Analyzing and Modeling Encryption Overhead forSensor Network Nodes. In the 2nd ACM International Conference on

Wireless Sensor Networks and Applications (WSNA), Sept. 2003.[8] A. Hintz. Fingerprinting Websites Using Traffic Analysis. In R. Din-

gledine and P. Syverson, editors, Proceedings of Privacy EnhancingTechnologies Workshop (PET2002), Lecture Notes in Computer Science2482, pages 171-178, 2002.

[9] S. Jiang, N. Vaidya, and W. Zhao. Routing in Packet Radio Networks toPrevent Traffic Analysis. In IEEE Information Assurance and SecurityWorkshop, 2000.

[10] D. Kesdogan, J. Egner, and R. Buschkes. Stop-and-go MIXes Provid-ing Probabilistic Security in an Open System. Second International

10 oWorkshop on Information Hiding (IH'98), Lecture Notes in ComputerScience 1525, pages 83-98, 1998.

[11] J. Kong and X. Hong. ANODR: ANonymous On Demand Routingwith Untraceable Routes for Mobile Ad-hoc Networks. In ACMMOBIHOC'03, pages 291-302, 2003.

_ [12] C. Ozturk, Y Zhang, and W. Trappe. Source-Location Privacy inEnergy-Constrained Sensor Network Routing. In ACM SASN, pages

88-93, 2004.[13] A. Pfitzmann and M. Kohntopp. Anonymity, Unobservability, and

Pseudonymity - A Proposal for Terminology. In H. Federrath, editor,DIAU'00, Lecture Notes in Computer Science 2009, pages 1-9, 2000.

[14] A. Pfitzmann, B. Pfitzmann, and M. Waidner. ISDNMixes: UntraceableCommunication with Very Small Bandwidth Overhead. In GIIITGConference: Communication in Distributed Systems, pages 451-463,1991.

[15] J.-F. Raymond. Traffic Analysis: Protocols, Attacks, Design Issues,and Open Problems. In H. Federrath, editor, DIAU'00, Lecture Notesin Computer Science 2009, pages 10-29, 2000.

[16] A. Serjantov and G. Danezis. Towards an Information Theoretic Metric300 for Anonymity. In R. Dingledine and P. Syverson, editors, Proceedings

of Privacy Enhancing Technologies Workshop (PET 2002), Lectures Notes in Computer Science 2482, pages 41-53, 2002.

[17] A. Serjantov, R. Dingledine, and P. F. Syverson. From a Trickle to a

Flood: Active Attacks on Several Mix Types. In F. A. P. Petitcolas,that with editor, Fifth International Workshop on Information Hiding (IH'02),

lix sensor Lecture Notes in Computer Science, 2578, pages 36-52, 2002.[18] UCLA Parallel Computing Laboratory and Wireless Adaptive

zities; and Mobility Laboratory. GloMoSim: A Scalable Simulation

ially with Environment for Wireless and Wired Network Systems.http://pcl.cs.ucla.edu/projects/glomosim/.

monstratei few base APPENDIX Ie stations. STEADY- STATE PROBABILITY X,

for these The balance equations can be written as:

f capacity mAx = (n + l)pxn±1, n> 2

Let p = mA/,u, we have xn as functions of X2:

pn-2zXn = 2 1 X2, n > 2

s and Trade- n.witz, editor, We then obtain the following:'01), Lecture p(, + p)

)nymity and 2and PrivacyPrivacv bv X1L PXO, Xpp X0-1 I v" y Uy

t system forrrath, editor,es 115-129,

Long Term;on, editors,(PET 2002),2002.nconditionalgy, 1(1):65-

and Anti-s. In IEEEorks (DSN),

Using the normalization condition yields00

XO + XIL + X1F + ZXn =

n=2

[ ~ ~ 0n-2 2xo n!

So

p°0 (1 +p)eP

1

1

1

exPRESH, 3*MeanDelayexPRESH, 2*MeanDelayexPRESH, 1*MeanDelay

PRESH

-El

exPRESH, 3*MeanDelayexPRESH, 2*MeanDelayexPRESH, 1*MeanDelay

PRESH

I