Bhaskar-DataCentric

8/3/2019 Bhaskar-DataCentric

http://slidepdf.com/reader/full/bhaskar-datacentric 1/19

Modelling Data-Centric Routing in Wireless Sensor Networks

Bhaskar Krishnamachari, Deborah Estrin, and Stephen Wicker

Computer Engineering Technical Report CENG 02-14

Department of Electrical Engineering – SystemsUniversity of Southern California

Los Angeles, California 90089-2562

213-740-4465



Modelling Data-Centric Routing

in Wireless Sensor Networks

Bhaskar Krishnamachari∗, Deborah Estrin †, Stephen Wicker‡

Abstract

Sensor networks differ from traditional networksin several ways: sensor networks have severe en-ergy constraints, redundant low-rate data, andmany-to-one flows. The end-to-end routingschemes that have been proposed in the liter-ature for mobile ad-hoc networks are not appro-priate under these settings. Data-centric tech-nologies are needed that perform in-network ag-gregation of data to yield energy-efficient dis-semination. In this paper we model data-centricrouting and compare its performance with tradi-tional end-to-end routing schemes. We examinethe impact of source-destination placement andcommunication network density on the energycosts, delay, and robustness of data aggregation.We show that data-centric routing offers signif-icant performance gains across a wide range of operational scenarios.

∗B. Krishnamachari is at the Dept. of Electrical Engi-

neering - Systems at the University of Southern Califor-

nia, Los Angeles, CA 90089, E-mail: [email protected]

†D. Estrin is at the UCLA Computer ScienceDepartment, Los Angeles, CA 90095, E-mail: de-

[email protected].‡S. Wicker is at the School of Electrical and Com-

puter Engineering, Cornell University, Ithaca, NY 14853,

E-mail: [email protected].

1 Introduction

The wireless sensor networks of the near future

are envisioned to consist of hundreds to thou-sands of inexpensive wireless nodes, each withsome computational power and sensing capabil-ity, operating in an unattended mode. Theyare intended for a broad range of environmen-tal sensing applications from vehicle tracking tohabitat monitoring [4], [28], [35]. The hardwaretechnology for these networks - low cost proces-sors, miniature sensing and radio modules arehere today, with further improvements in costand capabilities expected within the next decade

[4], [16], [19], [28], [29]. The applications, net-working principles and protocols for these sys-tems are just beginning to be developed [9], [10],[14], [28].

Wireless sensor networks are similar to mo-bile ad-hoc networks (MANETs) in that both in-volve multi-hop communications. However, thenature of the applications and routing require-ments for the two are significantly different inseveral respects. First, the typical mode of com-munication in a sensor network is from multi-

ple data sources to a data recipient/sink - a sortof a reverse-multicast, rather than communica-tion between any pair of nodes. Second, sincethe data being collected by multiple sensors isbased on common phenomena, there is likely to

1



be some redundancy in the data being communi-

cated by the various sources in sensor networks.Third, in most envisioned scenarios the sensorsare not mobile (though the sensed phenomenamay be), so the nature of the dynamics in thetwo networks is different. Finally, the single ma- jor resource constraint in sensor networks is thatof energy. The situation is much worse than eventhat in MANETs, where the communicating de-vices handled by human users can be replacedor recharged relatively often. The scale of sensornetworks and the necessity of unattended oper-

ation for months at a time means that energyresources have to be managed even more care-fully. This, in turn, precludes really high datarate communication.

For these reasons the many end-to-end routingprotocols that have been proposed for MANETsin recent years are not suitable for wireless sensornetworks. Alternative approaches are required.

Data aggregation has been put forward as aparticularly useful paradigm for wireless rout-ing in sensor networks [13], [17]. The idea is tocombine the data coming from different sources

enroute – eliminating redundancy, minimizingthe number of transmissions and thus saving en-ergy. This paradigm shifts the focus from thetraditional address-centric approaches (findingshort routes between pairs of addressable end-nodes) to a more data-centric approach (findingroutes from multiple sources to a single destina-tion that allows in-network consolidation of re-dundant data).

We do not propose any new protocols in thispaper, but rather attempt to address at a higher

level the gains and tradeoffs that can be achievedby using the data-centric approach as opposed tothe traditional address-centric approach.

The rest of the paper is organized as follows:in section 2, we define simple models of address-

centric and data-centric routing protocols for the

purpose of analysis. Since data aggregation isthe key concept in data-centric routing, we ex-amine optimal and suboptimal data aggregationmethods, measures of performance and factorsaffecting these measures in section 3. In section4 we present some theoretical results pertainingto the energy gains that can be achieved usingdata aggregation. This is followed by experimen-tal results showing how the various suboptimaldata-centric approaches compare with the opti-mal address-centric routing in terms of the aver-

age number of transmissions required per datumdelivered to the data sink. Section 5 presentsexperimental results showing the effect of sourceplacement, number and network density on thedelay due to aggregation latency. In section 6 wediscuss experimental results showing that dataaggregation results in some degree of robustnessto dynamics of the sensed phenomena. We cau-tion the reader about the assumptions and omis-sions in our modelling in section 7. Our work isplaced in the context of other relevant work insection 8. We present our conclusions in section

9.

2 Routing Models

We focus our attention on a single network flowthat is assumed to consist of a single data sinkattempting to gather information from a num-ber of data sources. We start with simple mod-els of routing schemes which use data aggrega-tion (which we term data-centric), and schemes

which do not (which we term address-centric). Inboth cases we assume there are some common el-ements - the sink first sends out a query/interestfor data, the sensor nodes which have the appro-priate data then respond with the data. They

2



Figure 1: Illustration of AC versus DC routing

differ in the manner the data is sent from thesources to the sink:

Address-centric Protocol (AC): Eachsource independently sends data along the short-est path to sink based on the route that thequeries took ( “end-to-end routing” ).

Data-centric Protocol (DC): The sources

send data to the sink, but routing nodes en-route look at the content of the data and performsome form of aggregation/consolidation functionon the data originating at multiple sources.

Figure 1 is a simple illustration of the dif-ference between AC and DC schemes. In theaddress-centric approach, each source sends itsinformation separately to the sink (source 1 rout-ing the data labelled “1” through node A, andsource 2 routing the data labelled “2” throughnodes C and B). In the data centric-approach,

the data from the two sources is aggregated atnode B, and the combined data (labelled “1+2”)is sent from B to the sink. The latter resultsin energy savings as fewer transmissions are re-quired to send the information from both sources

to the sink.

2.1 Differentiating Scenarios

One of goals in this paper is to understand thecontext in which data aggregation is useful. It ishelpful to consider the following scenarios clas-sified in terms of the type and dynamics of thedata sent by the sources:

1. All sources send completely different infor-mation (no redundancy)

2. All sources send identical information (com-plete redundancy)

3. The sources send information with some in-termediate, non-deterministic, level of re-dundancy

In case 1, Data aggregation cannot be per-formed - both AC and DC protocols will incurthe same number of transmissions for the sink toreceive all the data. In case 2, the AC protocolcan be modified to do as well as or even better

than the DC protocol by having the sink moni-tor the incoming information, realize that thereis duplicate information coming in, and ask allbut one of the sources to stop transmitting (as-suming that this duplication is sustained over aperiod of time). In case 3, however, the AC pro-tocol cannot be modified to do much better thanthe DC protocol - the non-deterministic natureof the redundancy implies that the sink cannotrequest some sources to shut down. Since allsources are required to transmit, and there is

some redundancy in their transmission, the bestoption is to aggregate the information comingfrom these sources, which is what the DC pro-tocol does. For the rest of the paper, it will beassumed that this scenario holds.

3



We now look at the notion of data aggregation

in more detail.

3 Data Aggregation

Data aggregation is the combination of datafrom different sources, and can be implementedin a number of ways. The simplest data ag-gregation function is duplicate suppression - inthe example of figure 1, if sources 1 and 2 bothsend the same data, node B will send only oneof these forward. Duplication suppression is al-

ready practiced in commercial wireless messag-ing networks. Other aggregation functions couldbe max , min , or any other function with multipleinputs. For our modelling purposes in this paperwe make a simplifying assumption - the aggre-gation function is such that each intermediatenode in the routing transmits a single aggregatepacket even if it receives multiple input packets.We will refer to the information received by thesink when it has obtained the messages transmit-ted by all sources in a given flow (whether or not

these messages are aggregated) as a “datum”.

3.1 Optimal Aggregation

Say there are k sources, labelled S 1 through S k,and a sink, labelled D. Let the network graphG = (V, E ) consist of all the nodes V , with E consisting of edges between nodes that can com-municate with each other directly. With the as-sumption that the number of transmissions fromany node in the data aggregation tree is exactlyone, the data aggregation tree can be thought

of as the reverse of a multicast tree: instead of a single source sending a packet to all receivers,all the sources are sending a single packet to thesame receiver. It is well-known that the mul-ticast tree with a minimum number of edges is

a minimum Steiner tree on the network graph.

The following can therefore be readily obtained:Result 1 : The optimum number of transmis-

sions required per datum for the DC protocol isequal to the number of edges in the minimumSteiner tree in the network which contains thenode set (S 1,...S k, D).

Corollary : Assuming an arbitrary placementof sources, and a general network graph G, thetask of doing DC routing with optimal data ag-gregation is NP-hard.

The latter follows from the NP-completeness

of the minimum Steiner problem on Graphs [11].

3.2 Suboptimal Aggregation

The following are three generally suboptimalschemes for generating data aggregation treesthat we examine in this paper.

1. Center at Nearest Source (CNS): Inthis data aggregation scheme, the sourcewhich is nearest the sink acts as the aggrega-tion point. All other sources send their data

directly to this source which then sends theaggregated information on to the sink.

2. Shortest Paths Tree (SPT) : In this dataaggregation scheme, each source sends itsinformation to the sink along the shortestpath between the two. Where these pathsoverlap for different sources, they are com-bined to form the aggregation tree.

3. Greedy Incremental Tree (GIT) : Inthis scheme the aggregation tree is built se-

quentially. At the first step the tree consistsof only the shortest path between the sinkand the nearest source. At each step afterthat the next source closest to the currenttree is connected to the tree.

4



This is by no means an exhaustive list, but is

representative of some of the data aggregationtree heuristics that can be implemented.

3.3 Performance measures

In exploring the gains and tradeoffs involved indata-centric protocols, we need to specify perfor-mance measures of interest. Three are examinedin some detail in this paper:

• Energy Savings: By aggregating the in-formation coming from the sources, the

number of transmissions is reduced, trans-lating to a savings in energy.

• Delay: There is latency associated with ag-gregation. Data from nearer sources mayhave to be held back at intermediate nodesin order to combine them with data fromsources that are farther away.

• Robustness: Somewhat related to the firstmeasure is the fact that with data aggrega-tion there is a decrease in the marginal en-ergy cost of connecting additional sources tothe sink. This can be considered as provid-ing some degree of robustness to dynamicsin the sensed phenomena.

3.4 Source Placement Models

The chief factors that can affect the performanceof data aggregation methods are the positionsof the sources in the network, the number of sources, and the communication network topol-

ogy. In order to investigate these factors, westudy two models of source placement, the event-radius (ER) model, and the random sources (RS)model. In both models, we generate a sensor net-work by scattering n sensor nodes randomly in a

Figure 2: Illustration of the event-radius modelfor source positions

unit square. One of these nodes is the data sink.All nodes are assumed to be able to communi-cate with any other nodes that are within somedistance R (the communication radius). The lo-cation of the data sources depends on the modelsas follows:

• Event-Radius Model: In this model, asingle point in the unit square is defined asthe location of an “event.” This may cor-respond to a vehicle or some other phe-nomenon being tracked by the sensor nodes.All nodes within a distance S (called thesensing range) of this event that are notsinks are considered to be data sources. Theaverage number of sources is approximatelyπ∗S 2∗n (somewhat less than this if we takeinto account boundary effects). This modelis shown in figure 2.

• Random-Sources Model: In this model,k of the nodes that are not sinks are ran-domly selected to be sources. Unlike in the

5



Figure 3: Illustration of the random-sourcesmodel for source positions

event-radius model, the sources are not nec-essarily clustered near each other. This isillustrated in figure 3.

4 Energy Savings due to Data

Aggregation4.1 Theoretical Results

We now give some analytical bounds on the en-ergy costs and savings that can be obtained withdata aggregation, based on the distances be-tween the sources and the sink, and the inter-distances among the sources. The upshot of thissection is that the greatest gains due to data ag-gregation are obtained when the sources are allclose together and far away from the sink.

Let di be the distance of the shortest path fromsource S i to the sink in the graph. Per datum,the total number of transmissions required forthe optimal AC protocol in this case (call it N A)is:

N A = d1 + d2 + ...dk = sum(di) (1)

How well can the optimal DC protocol do?Result 2 : Let the number of transmissions re-

quired for the optimal DC protocol be N D. ThenN D ≤ N A.

Proof : Doing data-aggregation optimally canonly decrease the minimum number of edgesneeded compared to the situation when sourcessend information to the sink along shortestpaths.

Definition : The “diameter” X of a set of nodesS in a graph G is the maximum of the pair-wise shortest paths between these nodes X =maxi,j∈S SP (i, j) where SP (i, j) is the shortestnumber of hops needed to go from node i to j inG.

Result 3 : If the source nodes S 1, S 2, . . . S khave a diameter X ≥ 1. The total number of transmissions (N D) required for the optimal DCprotocol satisfies the following bounds:

N D ≤ (k − 1)X + min(di) (2)

N D ≥ min(di) + (k − 1) (3)

Proof : (2) can be obtained by a construc-tion - the data aggregation tree which consists of (k − 1) sources sending their packets to the re-maining source which is nearest to the sink. Thistree has no more than (k−1)X + min(di) edges,hence the optimum tree must have no more thanthis. (3) is obtained by considering the smallestpossible Steiner tree which would happen if the

diameter were 1. In this case, the shortest pathfrom the source node at min(di) must be part of the minimum Steiner tree, and there is exactlyone edge from each of the other source nodes tothis node.

6



Result 4 : If the diameter X < min(di), then

N D < N A. In other words, the optimum data-centric protocol will perform better than the ACprotocol.

Proof :

⇒ N D < (k − 1)X + min(di) < (k)min(di)

⇒ N D < sum(di) = N A. (4)

Definition : Let us define the fractional sav-ings, F S , obtained by using the DC protocol as

opposed to the AC protocol as follows:

F S = (N A −N D)/(N A) (5)

F S can range from 0 (no savings) to 1 (100percent savings). The following are the lowerand upper bounds on F S , which follow directlyfrom (2) and (3) and the above definition.

Result 5 : The fractional savings FS satisfiesthe following bounds:

F S ≥ 1− ((k − 1)X + min(di))/sum(di) (6)F S ≤ 1− (min(di) + k − 1)/sum(di) (7)

To clarify the matter, assume that all thesources are at the same shortest-path distancefrom the sink. i.e. min(di) = max(di) = d.

Then we have that

1−((k − 1)X + d)

kd≤ F S

≤ 1−(d + k − 1)

(kd) (8)

Result 6 : Assume X and k are fixed, then asd tends to infinity (i.e. as the sink is farther andfarther away from the sources):

limd→∞F S = 1− 1/k. (9)

Proof :In the limit, X << d, and k << d. It suffices

to show that both lower and upper bounds in (8)converge to the same right hand side value:

limd→∞

1−

(k − 1)X + d

kd

= limd→∞1−(k − 1)X

kd

−d

kd = 1− 1/k(10)

and

limd→∞

1−

(d + k − 1)

(kd)

= limd→∞

1−

d

kd−

(k − 1)

kd

= 1− 1/k (11)

Essentially, what Result 6 tells us is quite in-tuitive. If the distance between the sink and

the sources is large compared to the distance be-tween the sources, then the optimal DC protocolgives k-fold savings. When there are 4 sourcesthat are close together and located far-away fromthe sink, then the AC protocol will have about4 times as many transmissions, i.e. there areroughly 75 % fewer transmissions with data ag-gregation. When there are 10 such sources, thegains are nearly 90 %, and so on...

Result 7 : If the subgraph G of the commu-nication graph G induced by the set of source

nodes (S 1, . . . S k) is connected, the optimal dataaggregation tree can be formed in polynomialtime.

Proof : The proof is constructive. Start GIT.The tree is initialized with the path from the

7



sink to the nearest source. At each additional

step of the GIT, the next source to be con-nected to the tree is always exactly one stepaway (such a source is guaranteed to exist sinceG is connected). At the end of the construc-tion, the number of edges in the tree is thereforedmin + (k − 1), which is the lower bound givenin relation (3). Hence the lower bound is tightand therefore optimal. The GIT constructionruns in polynomial time w.r.t. the number of nodes [33]. Hence although finding the optimaldata aggregation tree is NP-hard in general, in

this particular situation, we have a polynomialspecial-case.

Result 8 : In the ER model, when R > 2S ,the optimal data aggregation tree can be formedin polynomial time.

Proof : It is easy to see that when R > 2S , allsources are within one hop of each other. Thisis therefore a special case of result 8. Under thiscondition, both GIT and CNS schemes will resultin the optimal data aggregation tree.

4.2 Experimental ResultsWe now present our experimental results show-ing the energy costs of AC and DC protocolsfor both the ER and RS source placement mod-els. The experimental setup is as follows: for theER model, 5 evenly spaced values of the sensingrange S from 0.1 to 0.3 are tested, while for theRS model the number of sources k is varied 1to 15 in increments of 2. For both models thecommunication radius R is varied from 0.15 to0.45 in increments of 0.05. For each combina-

tion of S or k and R 100 experiments were run.Each experiment consists of a random placementof the n = 100 nodes including the sink node ina square are of unit size. In some cases (par-ticularly when the values of E or R are low) a

particular experiment may result in unconnected

graphs or no sources; the measurements fromthese cases are not taken into account while com-puting the averages. The error-bars shown in theplots represent the standard error in the mean.

Figures 4 and 7 show a 3D surface plot of thenumber of transmissions in the ideal AC proto-col for both models. In both cases these costsare greatest when a) the number of sources islarge and b) the communication range is low.The former is obvious as the number of trans-missions should increase with sources; the latter

is understandable since the number of hops be-tween the sources and the sink (indeed betweenany two nodes in the network) is high when thecommunication radius is low.

Figure 5 compares the transmission energycosts of the various protocols as the communi-cation range is varied, keeping the sensing rangeconstant at 0.2 (which corresponds to about 12.5sources on average, ignoring edge-effects). Atthe very bottom is the lower bound on N D givenin relation 3. In this figure it can be seen thatthe GITDC seems to coincide with the lower

bound all throughout. This is because when S is even of moderate length, with high probabil-ity, the subgraph which lies within the circle of radius S around the event is connected, and re-sult 7 holds. The performance of the CNSDCapproaches optimal as R increases, as per result8. The SPTDC protocol also performs well allthrough. In all cases there is a 50− 80% savingscompared to the AC protocol. Figure 8 is theequivalent plot for the Random Sources Model.The first thing to note is that the lower bound

is no longer tight, since the sources are placedrandomly anywhere in the network and unlessthe network is dense (high R) the sources areunlikely to be within one hop of each other. Inthis setting the GITDC performs the best, fol-

8



Figure 4: Cost of address-centric routing (av-erage number of transmissions) in event-radiusmodel

Figure 5: Comparison of energy costs versuscommunication radius in event-radius model

Figure 6: Comparison of energy costs versussensing range in event-radius model

Figure 7: Cost of address-centric routing (av-erage number of transmissions) in the random-sources model

9



Figure 8: Comparison of energy costs versuscommunication radius in random-sources model

Figure 9: Comparison of energy costs versusnumber of sources in random-sources model

lowed by SPTDC, CNSDC and AC, respectively.

CNSDC performs poorly in this setting since thesources are far apart and it doesn’t pay to alwaysaggregate at the source nearest to the sink.

Figures 6 and 9 both show that the transmis-sion costs increase as the number of sources isincreased. In the event-radius model, it can beseen that the CNSDC protocol performs poorlywhen the sensing range is really large. WhenS = 0.3, nearly a third of all nodes in the ex-periments act as sources and for many of thesesources it may be faster to route directly to the

sink rather than through one particular sourcethat is closest to the sink. Figure 9 shows thatthe gains due to a good data aggregation tech-nique (like GITDC) can be very significant whenthe number of sources is high.

To summarize, our experiments show that theenergy gains due to data aggregation can bequite significant with SPTDC or GITDC partic-ularly when there are are a lot of sources (largeS or large k) that are many hops from the sink(small R).

5 Delay due to Data Aggrega-

tion

Although data aggregation results in fewer trans-missions, there is a tradeoff - potentially greaterdelay because data from nearer sources may haveto be held back at an intermediate node in orderto be aggregated with data coming from sourcesthat are farther away. This can be seen by refer-ring back to figure 1; in figure 1b, node B which

acts as the aggregating node for sources 1 and2, is only one hop from source 1 but is two hopsfrom source 2. Thus if both sources transmit thedata simultaneously, the data from source 1 willget to B before the data from source 2 and take

10



longer to get to the sink than it would in the

no aggregation scheme shown in figure 1a. Notethat this delay depends on the aggregation func-tion - for some simple kinds of data aggregationsuch as duplicate suppression, there is no needfor data to be withheld at an aggregating node.For more complicated forms of data aggregation,where the output aggregated packet depends onthe combination of multiple input packets thisdelay is an issue.

It can be seen that, in the worst case, the la-tency due to aggregation will be proportional tothe number of hops between the sink and thefarthest source. When no aggregation is em-ployed, the delay between the time when the var-ious sources transmit data and the sink receivesthe first packet is proportional to the number of hops between the sink and the nearest source.Hence one way to quantify the effect of aggrega-tion delay is to examine the difference betweenthese two distances. This is shown in figures 10-

13. The experimental setup is the same as dis-cussed in section 4.2. The upper curve in allthese figures is representative of the latency de-lay in DC schemes with non-trivial aggregationfunctions and the lower curve is representativeof the latency delay in AC schemes. The dif-ference between these curves is greatest in bothmodels when the communication radius is low,and when the number of sources is high. In fig-ure 13, as the number of sources increases thetwo curves saturate to extreme values. The up-

per curve saturates to a value of about 4 whichis about the maximum number of hops betweenthe sink and any node in the network. The lowercurve saturates at a value close to the minimumnumber of hops (1).

6 Robustness due to Data Ag-

gregation

As indicated before, with data aggregation thereis a lower marginal energy cost of connecting ad-ditional sources to the sink as opposed to the ACapproach. Consider the GITDC protocol, for ex-

ample. At each step, the energy cost (in termsof additional edges, i.e. additional number of transmissions per datum) of connecting an ad-ditional source is simply the shortest distance of that source to the aggregation tree at the cur-rent step. In the AC protocol, however, the costof adding an additional source is the distance of that source all the way to the sink. Figures 14and 15 show this relationship. The x-axis rep-resents the number of sources connected to thesink and the y-axis represents the total numberof transmissions required for all sources to com-

municate to the sink. In both models of sourceplacement, ER as well as RS, for a given en-ergy budget, a greater number of sources can beconnected to the sink. Thus the energy savingswith aggregation can be transformed to providea greater degree of robustness to dynamics in thesensed phenomena. For example if at any giventime only a fraction of all sources can give ac-curate readings of the sensed phenomena, thena greater number of such readings can be pro-cessed and sent to the sink in the DC protocol

even if both schemes use the same amount of total energy. The gains are, of course, smallerin the random-sources model since there is lessscope for aggregation when the sources are ran-domly located.

11



Figure 10: Distance of sink to nearest andfarthest source versus communication radius inevent-radius model

7 Shortcomings of the Mod-

elling

The reader should note the simplifying assump-

tions we have made in the above analysis andsome performance issues that have not beentaken into account in our modelling. First, wehave chosen to make a somewhat stark contrastbetween routing protocols which do and do notuse data aggregation in order to highlight the ef-fect of data aggregation on routing performance.It can be argued that our categorization of pro-tocols into address-centric versus data-centric onthis basis is overly simplistic, and has ignoredthe possibility of hybrid protocols which com-

bine the advantages of both. For example, onecan conceive of the sources finding routes to thesink in a traditional end-to-end manner, whilethe intermediate nodes are endowed with someintelligence and can perform data aggregation on

Figure 11: Distance of sink to nearest and far-thest source versus sensing range in event-radiusmodel

messages passing through them.

Second, in discussing the energy costs of therouting protocols we have not considered anyof the overhead costs involved in setting up or

maintaining the routing paths from the sourcesto the sink. There are two reasons for this - oneis that to do so we would have been forced totake into account specifics about how the routesare set up and this would affect the generalityof the analysis, the second is that it is not clearthat taking these costs into account would haveany differential impact on AC versus DC proto-cols. In addition to energy savings due to dataaggregation, there will also be some additionalenergy savings through the avoidance of globally

unique IDs, which can require a significant num-ber of bytes in the small packets typical of sensornetworks. Such savings have not been modelledhere.

Third, our analysis of the delay focused only

12



Figure 12: Distance of sink to nearest andfarthest source versus communication radius inrandom-sources model

on the latency due to aggregation. There aretwo other possible sources of delay that we didnot take into account - processing delay and de-lay due to congestion. There is likely to be an

additional delay in data-centric protocols due tothe processing that needs to be performed byaggregating nodes. It could be argued that thisprocessing delay is a second order effect and un-likely to be as significant as the latency delaywe analyzed. We chose not to model the de-lay due to congestion as this would depend ona number of additional details such as the MACprotocol used, the traffic in the network; also,again, it is not clear that congestion delay wouldhave a differential impact on data centric versus

address-centric protocols.Finally, our analysis has focused on the case

where there is a single sink. Although this isa reasonable scenario for many applications, itis reasonable to ask what would happen if there

Figure 13: Distance of sink to nearest and far-thest source versus number of sources in random-sources model

were additional sinks. One solution is to think of the different flows in that case as a superpositionof many single sink data-flows. However, thiswould yield an over-estimate of the energy costs,

as further aggregation savings can be possibleif there are redundancies in the sources and thedata being requested by the various sinks. Thiswill be a topic for future study.

8 Related Work

The use of sensor networks has been envisionedin a range of settings such as industrial appli-cations [35], vehicle tracking applications [28]

and habitat monitoring [4]. A number of inde-pendent efforts have been made in recent yearsto develop the hardware and software architec-tures needed for wireless sensing. Of particularnote are UC Berkeley’s Smart Dust Motes [19],

13



Figure 14: Average number of transmissions re-quired to connect to a given number of sourcesin event-radius model

TinyOS [16], and the PicoRadio [29] project; theWireless Integrated Network Sensors (WINS)project [28] and PC-104 based sensors [4] de-veloped at University of California Los Angeles;

and the µAMPS project at MIT [23]. The chal-lenges and design principles involved in network-ing these devices are discussed in [9], [10], and[22]. Energy-efficient medium access schemes ap-plicable for sensor networks are presented in [7],[31] and [37]. Techniques for balancing the en-ergy load among sensors using randomized rota-tion of cluster heads are discussed in [15]. Someattention has also been given to developing lo-calized self-configuration mechanisms in sensornetworks [5].

The great majority of wireless routing proto-cols developed in recent years have been for mo-bile ad-hoc communication networks [27]. De-pending on whether the routes are maintainedat all times or if they are created afresh when

Figure 15: Average number of transmissions re-quired for sink to receive data from a given num-ber of sources in random-sources model

needed, these are categorized into proactive [24],reactive [18], [20], [25], [26], or hybrid [12] proto-cols. Some work has also been directed to in-corporating GPS-like geographical information

with the routing technique [1], [21]. These ap-proaches are all address-centric, in that they arefocused on end-to-end routing between pairs of addressable nodes. There has also been work onaddress-centric routing protocols that conserveenergy and maximize the system lifetime. Inthese schemes, which are applicable to sensornetworks, routing metrics that incorporate en-ergy expenditure considerations are defined foreach link [6], [32].

The application-specific nature of sensor net-

works leads to the alternative approach we havedescribed in this paper as data-centric. Themeta-naming of data is suggested in [14] as ameans to reduce transmission of redundant datafor flooding-like schemes for information dissem-

14



ination. Directed diffusion [17] is the protocol

that is most like the data-centric routing modelsanalyzed in this paper. In directed diffusion, allnodes are application-aware and communicatenamed data. The benefits of application specific,in-network processing and data-aggregation arequantified through experimental results in [13].In the experimental set-up described in that pa-per, the form of data aggregation used (duplicatesuppression) reduces the traffic by up to 42% forfour sources.

The notion of in-network processing during

routing is not unique to sensor networks alone.In Active Networks, intermediate routing nodescan perform customized computations on andmodify the contents of messages passing throughthem on a per-user or per-application basis [34].Limited router-assist techniques have also beenproposed for multicast on the internet whichwould permit intermediate routers to look atspecial router-assist fields on packets in order toeliminate redundant signaling and perform sub-casting [3].

Optimal data aggregation, as we have shown

in this paper, requires the formation of a min-imum Steiner tree, a well known NP-completeproblem arising in many networking contexts[36]. The greedy incremental tree (GIT) heuris-tic scheme described in our paper (also knownas the nearest participant first algorithm) isa well-known approximation algorithm for thisproblem[33]. It is known to have an approx-imation ratio of 2 (i.e. the tree that it out-puts can have no more than 2 times as manyedges as the optimum). A distributed version

of this algorithm is discussed in [2]. The bestknown approximation algorithm for the mini-mum Steiner tree problem has an approximationratio of about 1.55 [30].

Finally, we mention here in passing that there

is another sense in which the phrase “data-

centric networking” has been used [8]; namelyto describe an approach to ubiquitous comput-ing in which human users are identified not withstatic computing devices but with their person-alized services and data.

9 Conclusions

We have modelled and analyzed the performanceof data-centric routing in wireless sensor net-works. We identified and investigated some of the factors affecting performance, such as thenumber of placement of sources, and the commu-nication network topology. The formation of anoptimal data aggregation tree is generally NP-hard. We presented some suboptimal data ag-gregation tree generation heuristics and showedthe existence of polynomial special cases.

The modelling tells us that whether the

sources are clustered near each other or locatedrandomly, significant energy gains are possiblewith data aggregation. These gains are great-est when the number of sources is large, andwhen the sources are located relatively close toeach other and far from the sink. We discussedhow these energy gains can be translated intoincreased robustness to dynamics in sensed phe-nomena. The modelling, though, also seems tosuggest that aggregation latency could be non-negligible and should be taken into consideration

during the design process. Data-centric architec-tures such as directed diffusion should support aType of Service (TOS) facility that would permitapplications to effect desired tradeoffs betweenlatency and energy.

15



Acknowledgments

The authors would like to acknowledge the helpof Chalermek Intanagonwiwat.

References

[1] S. Basagni, I. Chlamtac, V. R. Syrotiuk and B. AWoodward. “A distance routing effect algorithmfor mobility (DREAM),” In ACM/IEEE Inter-national Conference on Mobile Computing and Networks (MOBICOM’98), pages 76-84, October1999, Dallas, Texas.

[2] F. Bauer and A. Varma. “Distributed algorithmsfor multicast path setup in data networks,” in Transactions on Networking , vol. 4, no. 2, 181-191, 1996.

[3] B. Cain, T. Speakman, D. Towsley, “GenericRouter Assist (GRA) Building Block Motiva-tion and Architecture,” RMT Working Group,Internet-Draft <draft-ietf-rmt-gra-arch-01.txt >,work in progress, March 2000.

[4] A. Cerpa et al., “Habitat monitoring: Appli-cation driver for wireless communications tech-nology,” 2001 ACM SIGCOMM Workshop on Data Communications in Latin America and theCaribbean , Costa Rica, April 2001.

[5] A. Cerpa and D. Estrin, “ASCENT: AdaptiveSelf-Configuring sEnsor Networks Topologies,”unpublished .

[6] J.-H. Chang, L. Tassiulas, “Energy ConservingRouting in Wireless Ad-hoc Networks,” INFO-COM , 2000.

[7] Seong-Hwan Cho and A. Chandrakasan, “Energy

Efficient Protocols for Low Duty Cycle Wire-less Microsensor Networks”, ICASSP 2001, May2001.

[8] M. Esler, J. Hightower, T. Anderson, and G. Bor-riello “Next Century Challenges: Data-Centric

Networking for Invisible Computing: The Por-

tolano Project at teh University of Washington,”ACM/IEEE International Conference on MobileComputing and Networks (MobiCom ’99), Seat-tle, Washington, August 1999.

[9] D. Estrin, R. Govindan, J. Heidemann and S.Kumar, “Next Century Challenges: Scalable Co-ordination in Sensor Networks,” ACM/IEEE In-ternational Conference on Mobile Computing and Networks (MobiCom ’99), Seattle, Washington,August 1999.

[10] D. Estrin, L. Girod, G. Pottie, and M. Sri-

vastava, “Instrumenting the World with Wire-less Sensor Networks,” International Confer-ence on Acoustics, Speech and Signal Processing (ICASSP 2001), Salt Lake City, Utah, May 2001.

[11] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-completeness, Freeman, San Francisco, 1979.

[12] Z. J. Haas, and M. R. Pearlman, “The ZoneRouting Protocol (ZRP) for Ad-Hoc Networks,”IETF MANET, Internet-Draft , December 1997.

[13] J. Heidemann, F. Silva, C. Intanagonwiwat, R.Govindan, D. Estrin, and D. Ganesan, “Build-ing Efficient Wireless Sensor Networks with Low-Level Naming,” 18th ACM Symposium on Oper-ating Systems Principles, October 21-24, 2001.

[14] W.R. Heinzelman, J. Kulik, and H. Balakrish-nan “Adaptive Protocols for Information Dis-semination in Wireless Sensor Networks,” Pro-ceedings of the Fifth Annual ACM/IEEE Inter-national Conference on Mobile Computing and Networking (MobiCom ’99), Seattle, Washing-ton, August 15-20, 1999, pp. 174-185.

[15] W.R. Heinzelman, A. Chandrakasan, and H.Balakrishnan “Energy-Efficient CommunicationProtocol for Wireless Microsensor Networks,”33rd International Conference on System Sci-ences (HICSS ’00), January 2000.

16



[16] J. Hill et al., “System Architecture Directions

for Networked Sensors,” ASPLOS , 2000.[17] C. Intanagonwiwat, R. Govindan and D. Es-

trin, “Directed Diffusion: A Scalable and RobustCommunication Paradigm for Sensor Networks,”ACM/IEEE International Conference on MobileComputing and Networks (MobiCom 2000), Au-gust 2000, Boston, Massachusetts

[18] D.B. Johnson, and D.A. Maltz, “DynamicSource Routing in Ad-Hoc Wireless Networking,”in Mobile Computing , T. Imielinski and H. Korth,editors, Kluwer Academic Publishing, 1996.

[19] J. M. Kahn, R. H. Katz and K. S. J. Pister, “Mo-bile Networking for Smart Dust”, ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom 99), Seattle, WA, Au-gust 17-19, 1999

[20] Karp, B. and Kung, H.T., Greedy PerimeterStateless Routing for Wireless Networks, in Pro-ceedings of the Sixth Annual ACM/IEEE Inter-national Conference on Mobile Computing andNetworking (MobiCom 2000), Boston, MA, Au-gust, 2000, pp. 243-254.

[21] Y.B. Ko and N.H. Vaidya, “Location-aidedrouting (LAR) in mobile ad hoc networks,”ACM/IEEE International Conference on MobileComputing and Networking (MobiCom), 1998,pp. 66-75.

[22] W.W. Manges, “Wireless Sensor NetworkTopologies,” Sensors Magazine, vol. 17, no. 5,May 2000.

[23] Rex Min, Manish Bhardwaj, Seong-Hwan Cho,Amit Sinha, Eugene Shih, Alice Wang, andAnantha Chandrakasan, “Low-Power WirelessSensor Networks”, VLSI Design 2000 , January

2001.

[24] S. Murthy, and J.J. Garcia-Luna-Aceves, “AnEfficient Routing Protocol for Wireless Net-works,” MONET , vol. 1, no. 2, pp. 183-197, Oc-tober 1996.

[25] V.D. Park, and M.S. Corson, “A Highly Adap-

tive Distributed Routing Algorithm for MobileWireless Networks,” IEEE INFOCOM ’97 , Kobe,Japan, 1997.

[26] C. Perkins, “Ad Hoc On-Demand Distance Vec-tor (AODV) Routing,” IETF MANET , Internet-Draft, December 1997.

[27] C. Perkins, Ad Hoc Networking , Addison-Wesley, 2000.

[28] G.J. Pottie, W.J. Kaiser, “Wireless IntegratedNetwork Sensors,” Communications of the ACM ,

vol. 43, no. 5, pp. 551-8, May 2000.

[29] J. Rabaey, J. Ammer, J.L. da Silva Jr. and D.Patel, “PicoRadio: Ad-Hoc Wireless Network-ing of Ubiquitous Low-Energy Sensor/MonitorNodes ,” Proceedings of the IEEE Computer Soci-ety Annual Workshop on VLSI (WVLSI’00), Or-lando, Florida, April 2000.

[30] G. Robins, A. Zelikovsky, “Improved SteinerTree Approximation in Graphs,” Proc. of ACM/SIAM Simposium on Discrete Algorithms,pp. 770-779, 2000.

[31] S. Singh, C.S. Raghavendra, “Power efficientMAC protocol for multihop radio networks,” Pro-ceedings PIMRC’98 , Sept. 1998.

[32] S. Singh, M. Woo, C.S. Raghavendra, “Power-aware routing in mobile ad-hoc networks,”ACM/IEEE International Conference on Mo-bile Computing and Networks (MOBICOM’98),pages 76-84, October 1999, Dallas, Texas.

[33] H. Takahashi and A. Matsuyama, “An approxi-mate solution for the steiner problem in Graphs,”

Math. Japonica , vol. 24, no. 6, pp. 573-577, 1980.

[34] D.L. Tennenhouse, J.M. Smith, W.D. Sincoskie,D.J. Wetherall, and G.J. Minden, “A survey of active network research,” IEEE CommunicationsMagazine, vol. 35, no. 1, pp. 80-86, January 1997.

17



[35] J. Warrior, “Smart Sensor Networks of the Fu-

ture,” Sensors Magazine, March 1997.

[36] P. Winter, “Steiner Problem in Networks: A sur-vey,” Networks, vol. 17, no. 2, pp. 129-167, 1987.

[37] A. Woo and D.E. Culler, “A Transmission Con-trol Scheme for Media Access in Sensor Net-works,” ACM/IEEE International Conference on Mobile Computing and Networks (MobiCom ’01),July 2001, Rome, Italy.

18

Documents

Bhaskar-DataCentric