
PowerSpy: Location Tracking using Mobile Device Power Analysis

Yan Michalevsky, Dan Boneh and Aaron Schulman

Computer Science Department

Stanford University

Gabi Nakibly

National Research and Simulation Center

Rafael Ltd.

Abstract

Modern mobile platforms like Android enable applications to read aggregate power usage on the phone. This information is considered harmless and reading it requires no user permission or notification. We show that by simply reading the phone’s aggregate power consumption over a period of a few minutes an application can learn information about the user’s location. Aggregate phone power consumption data is extremely noisy due to the multitude of components and applications that simultaneously consume power. Nevertheless, by using machine learning algorithms we are able to successfully infer the phone’s location. We discuss several ways in which this privacy leak can be remedied.

1 Introduction

Our phones are always within reach and their location is mostly the same as our location. In effect, tracking the location of a phone is practically the same as tracking the location of its owner. Since users generally prefer that their location not be tracked by arbitrary 3rd parties, all mobile platforms consider the device’s location as sensitive information and go to considerable lengths to protect it: applications need explicit user permission to access the phone’s GPS and even reading coarse location data based on cellular and WiFi connectivity requires explicit user permission.

In this work we show that despite these restrictions applications can covertly learn the phone’s location. They can do so using a seemingly benign sensor: the phone’s power meter that measures the phone’s power consumption over a period of time. Our work is based on the observation that the phone’s location significantly affects the power consumed by the phone’s cellular radio. The power consumption is affected both by the distance to the cellular base station to which the phone is currently attached (free-space path loss) and by obstacles, such as buildings and trees, between them (shadowing). The closer the phone is to the base station and the fewer obstacles between them the less power the phone consumes.

The strength of the cellular signal is a major factor affecting the power used by the cellular radio [24]. Moreover, the cellular radio is one of the most dominant power consumers on the phone [10].

Suppose an attacker measures in advance the power profile consumed by a phone as it moves along a set of known routes or in a predetermined area such as a city. We show that this enables the attacker to infer the target phone’s location over those routes or areas by simply analyzing the target phone’s power consumption over a period of time. This can be done with no knowledge of the base stations to which the phone is attached.

A major technical challenge is that power is consumed simultaneously by many components and applications on the phone in addition to the cellular radio. A user may launch applications, listen to music, turn the screen on and off, receive a phone call, and so on. All these activities affect the phone’s power consumption and result in a very noisy approximation of the cellular radio’s power usage. Moreover, the cellular radio’s power consumption itself depends on the phone’s activity, as well as the distance to the base station: during a voice call or data transmission the cellular radio consumes more power than when it is idle. All of these factors contribute to the phone’s power consumption variability and add noise to the attacker’s view: the power meter only provides aggregate power usage and cannot be used to measure the power used by an individual component such as the cellular radio.

Nevertheless, using machine learning, we show that the phone’s aggregate power consumption over time completely reveals the phone’s location and movement. Intuitively, the reason why all this noise does not mislead our algorithms is that the noise is not correlated with the phone’s location. Therefore, a sufficiently long power measurement (several minutes) enables the learning algorithm to “see” through the noise. We refer to power consumption measurements as time-series and use methods for comparing time-series to obtain classification and pattern matching algorithms for power consumption profiles.

In this work we use machine learning to identify the routes taken by the victim based on previously collected power consumption data. We study three types of user tracking goals:

1. Route distinguishability: As a warm-up we ask if an attacker can tell what route the user is taking from among a fixed set of possible routes.

2. Real-time motion tracking: Assuming the user is taking a certain known route, we ask if an attacker can identify her location along the route and track the device’s position on the route in real-time.

3. New route inference: Finally, suppose a user is moving along an arbitrary (long) route. We ask if an attacker can learn the user’s route using the previously measured power profile of many (short) road segments in that area. The attacker composes the power profile of the short road segments to identify the user’s route and location at the end of the route.

We emphasize that our approach is based on measuring the phone’s aggregate power consumption and nothing else. In particular, we do not use the phone’s signal strength as this data is protected on Android and iOS devices and reading it requires user permission. In contrast, reading the phone’s power meter requires no special permissions.

On Android reading the phone’s aggregate power meter is done by repeatedly reading the following two files:

/sys/class/power_supply/battery/voltage_now

/sys/class/power_supply/battery/current_now

Over a hundred applications in the Play Store access these files. While most of these simply monitor battery usage, our work shows that all of them can also easily track the user’s location.
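As an illustration of how little is needed to sample aggregate power, the following minimal sketch reads the two files once and multiplies the values. It is not the PowerSpy application's code. The function names are hypothetical, and the units are an assumption: the Linux power-supply interface conventionally reports microvolts and microamps, but the scale and sign of the current vary between devices, so the sketch takes an absolute value.

```python
from pathlib import Path

# Directory named in the text; the exact layout can differ between devices.
BATTERY_DIR = Path("/sys/class/power_supply/battery")


def power_watts(voltage_uv: int, current_ua: int) -> float:
    """Convert one reading to watts, ASSUMING microvolts and microamps."""
    # abs() because the sign convention for current_now is device dependent.
    return abs(voltage_uv * current_ua) * 1e-12


def read_power(battery_dir: Path = BATTERY_DIR) -> float:
    """Read voltage_now and current_now once and return instantaneous power."""
    voltage_uv = int((battery_dir / "voltage_now").read_text().strip())
    current_ua = int((battery_dir / "current_now").read_text().strip())
    return power_watts(voltage_uv, current_ua)
```

Calling `read_power()` at a fixed interval and recording a timestamp with each value would yield the kind of power time series analyzed in the rest of the paper.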

Our contributions. Our work makes the following contributions:

• We show that the power meter available on modern phones can reveal potentially private information.

• We develop the machine learning techniques needed to use data collected from the power meter to infer location information. The technical details of our algorithms are presented in Sections 4, 5 and 6, followed by experimental results.

• In Sections 8 and 9 we discuss potential continuation to this work, as well as defenses to prevent this type of information leakage.

2 Threat Models

We assume a malicious application is installed on the victim’s device and runs in the background. The application has no permission to access the GPS or any other location data such as the cellular or WiFi components. In particular, the application has no permission to query the identity of visible cellular base stations or the SSID of visible WiFi networks.

We only assume access to power data (which requires no special permissions on Android) and permission to communicate with a remote server. Network connectivity is needed to generate dummy low-rate traffic to prevent the cellular radio from going into a low-power state. In our setup we also use network connectivity to send data to a central server for processing. However, it may be possible to do all processing on the phone.

As noted earlier, the application can only read the aggregate power consumed by the phone. It cannot measure the power consumed by the cellular radio alone. This presents a significant challenge since many components on the phone consume variable amounts of power at any given time. Consequently, all the measurements are extremely noisy and we need a way to “see” through the noise.

To locate the phone, we assume the attacker has prior knowledge of the area or routes through which the victim is traveling. This knowledge allows the attacker to measure the power consumption profile of different routes in that area in advance. Our system correlates this data with the phone’s measured power usage and we show that, despite the noisy measurements, we are able to correctly locate the phone.

For this to work we need the tracked phone to be moving in a car or a bus while being tracked. Our system cannot locate a phone that is standing still since that only provides the power profile for a single location. We need multiple adjacent locations for the attack to work.

Given the resources at our disposal, the focus of this work is on locating a phone among a set of local routes in a pre-determined area. A larger effort is needed to scale the system to cover the entire world by pre-measuring the power profile of all road segments worldwide. Nevertheless, our localized experiments already show that tracking users who follow a daily routine is quite possible. For example, a mobile device owner might choose one of a small number of routes to get from home to work. The system correctly identifies what route was chosen and in real-time identifies where the phone is along that route. This already serves as a cautionary note about the type of information that can be leaked by a seemingly innocuous sensor like the power meter.

We note that scaling the system to cover worldwide road segments can be done by crowd sourcing: a popular app, or perhaps even the core OS, can record the power profile of streets traveled by different users and report the results to a central server. Over time the resulting dataset will cover a significant fraction of the world. On the positive side, our work shows that service providers can legitimately use this dataset to improve the accuracy of location services. On the negative side, tracking apps can use it to covertly locate users. Given that all that is required is one widespread application, many actors in the mobile space are in a position to build the required dataset of power profiles and use it as they will.

3 Background

In this section we provide technical background on the relation between a phone’s location and its cellular power consumption. We start with a description of how location is related to signal strength, then we describe how signal strength is related to power consumption. Finally, we present examples of this phenomenon, and we demonstrate how obtaining access to power measurements could leak information about a phone’s location.

3.1 Location affects signal strength and power consumption

Distance to the base station is the primary factor that determines a phone’s signal strength. The reason is that, for signals propagating in free space, the signal’s power loss is proportional to the square of the distance it travels [9]. Signal strength is not only determined by path loss; it is also affected by objects in the signal path, such as trees and buildings, that attenuate the signal. Finally, signal strength also depends on multi-path interference caused by objects that reflect the radio signal back to the phone through various paths having different lengths.
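For reference, the inverse-square relation mentioned above is usually written with the Friis free-space equation. The symbols below are not defined in the paper and are introduced only for this illustration: transmit and receive powers $P_t$ and $P_r$, antenna gains $G_t$ and $G_r$, wavelength $\lambda$, and distance $d$.

```latex
P_r = P_t \, G_t \, G_r \left( \frac{\lambda}{4 \pi d} \right)^{2}
\qquad\Longrightarrow\qquad
L_{\mathrm{fs}}(d) = \left( \frac{4 \pi d}{\lambda} \right)^{2},
\qquad
L_{\mathrm{fs}}^{\mathrm{dB}}(d) = 20 \log_{10}\!\left( \frac{4 \pi d}{\lambda} \right)

L_{\mathrm{fs}}^{\mathrm{dB}}(2d) - L_{\mathrm{fs}}^{\mathrm{dB}}(d) = 20 \log_{10} 2 \approx 6.02\ \mathrm{dB}
```

Under this idealized model, doubling the distance to the base station costs about 6 dB of received power, before any shadowing or multi-path effects are taken into account.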

In wireless communication theory signal strength is often modeled as random variation (e.g., log-normal shadowing [9]) to simulate many different environments¹. However, in one location signal strength can be fairly consistent as base stations, attenuators, and reflectors are mostly stationary.

The strength of the signal a phone receives from its base station affects its cellular modem power consumption. Namely, phone cellular modems consume less instantaneous power when transmitting and receiving at high signal strength compared to low signal strength. Schulman et al. [24] observed this phenomenon on several different cellular devices operating on different cellular protocols. They showed that communication at a poor signal location can result in a device power draw that is 50% higher than at a good signal location.

The primary reason for this phenomenon is the phone’s power amplifier used for transmission, which increases its gain as signal strength drops [9]. This effect also occurs when a phone is only receiving packets. The reason for this is that cellular protocols require constant transmission of channel quality and acknowledgments to base stations.

¹ Parameters of the model can be calibrated to better match a specific environment of interest.

Figure 2: For two phones of the same model, power variations on the same drive are similar. [Plot: power in watts versus time in seconds for Device 1 and Device 2.]

3.2 Power usage can reveal location

The following results from driving experiments demonstrate the potential of leaking location from power measurements.

We first demonstrate that signal strength in each location on a drive can be static over the course of several days. We collected signal strength measurements from a smartphone once, and again several days later. In Figure 1 we plot the signal strength observed on these two drives. In this figure it is apparent that (1) the segments of the drive where signal strength is high (green) and low (red) are in the same locations across both days, and (2) that the progression of signal strength along the drive appears to be a unique irregular pattern.

Next, we demonstrate that just like signal strength, power measurements of a smartphone, while it communicates, can reveal a stable, unique pattern for a particular drive. Unlike signal strength, power measurements are less likely to be stable across drives because power depends on how the cellular modem reacts to changing signal strength: a small difference in signal strength between two drives may put the cellular modem in a mode that has a large difference in power consumption. For example, a small difference in signal strength may cause a phone to hand off to a different cellular base station and stay attached to it for some time (Section 3.3).

Figure 2 shows power measurements for two Nexus 4 phones in the same vehicle, transmitting packets over their cellular link, while driving on the same path. The power consumption variations of the Nexus 4 phones are similar, indicating that power measurements can be mostly stable across devices.

Figure 1: Signal strength profiles measured on two different days are stable (the maps were blurred for blind review, the final version will contain the full details).

Figure 3: For two different phone models, power variations on the same drive are similar. [Plot: normalized power versus time in seconds for Nexus 4 and Nexus 5.]

Finally, we demonstrate that power measurements could be stable across different models of smartphones. This stability would allow an attacker to obtain a reference power measurement for a drive without using the same phone as the victim’s. We recorded power measurements, while transmitting packets over cellular, using two different smartphone models (Nexus 4 and Nexus 5) during the same ride, and we aligned the power samples according to absolute time.

The results presented in Figure 3 indicate that there is similarity between different models that could allow one model to be used as a reference for another. This experiment serves as a proof of concept: we leave further evaluation of such an attack scenario, where the attacker and victim use different phone models, to future work. In this paper, we assume that the attacker can obtain reference power measurements using the same phone model as the victim.

3.3 Hysteresis

A phone attaches to the base station having the strongest signal. Therefore, one might expect that the base station to which a phone is attached and the signal strength will be the same in one location. Nonetheless, it is shown in [24] that signal strength can be significantly different at a location based on how the device arrived there, for example, the direction of arrival. This is due to the hysteresis algorithm used to decide when to hand off to a new base station. A phone hands off from its base station only when its received signal strength dips below the signal strength from the next base station by more than a given threshold [21]. Thus, two phones that reside in the same location can be attached to two different base stations.
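The direction dependence can be seen in a toy simulation of the hand-off rule described above. This is a simplified sketch with two hypothetical base stations "A" and "B" and made-up signal strengths in dBm. It is not the algorithm of any particular cellular standard, and the function name and margin value are assumptions.

```python
def attached_stations(rssi_a, rssi_b, margin_db, start):
    """Return the serving station ("A" or "B") at each position along a drive.

    A hand-off happens only when the other station is stronger than the
    serving one by more than margin_db (the hysteresis threshold).
    """
    current = start
    history = []
    for a, b in zip(rssi_a, rssi_b):
        serving, other = (a, b) if current == "A" else (b, a)
        if other - serving > margin_db:
            current = "B" if current == "A" else "A"
        history.append(current)
    return history


# Signal strength at four positions on a road running from station A to B.
to_b_a, to_b_b = [-60, -70, -80, -90], [-90, -80, -70, -60]
forward = attached_stations(to_b_a, to_b_b, margin_db=15, start="A")
# Same road driven in the opposite direction, starting attached to B.
backward = attached_stations(to_b_a[::-1], to_b_b[::-1], margin_db=15, start="B")
backward_by_position = backward[::-1]
```

In this toy setup the two middle positions are served by station A on the forward drive and by station B on the return drive, which is why a reference profile is only useful for the same direction of travel.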

Hysteresis has two implications for determining a victim’s location from power measurements: (1) an attacker can only use the same direction of travel as a reference power measurement, and (2) it will complicate inferring new routes from power measurements collected from individual road segments (Section 6).

3.4 Background summary and challenges

The initial measurements in this section suggest that the power consumed by the cellular radio is a side channel that leaks information about the location of a smartphone. However, there are four significant challenges that must be overcome to infer location from the power meter. First, during the pre-measurement phase the attacker may have traveled at a different speed and encountered different stops than the target phone. Second, the attacker will have to identify the target’s power profile from among many pre-collected power profiles along different routes. Third, once the attacker determines the target’s path, the exact location of the target on the path may be ambiguous because of similarities in the path’s power profile. Finally, the target may travel along a path that the attacker only partially covered during the pre-measurement phase: the attacker may have only pre-collected measurements for a subset of segments in the target’s route. In the following sections we describe techniques that address each of these challenges and experiment with their accuracy.

4 Route distinguishability

As a warm-up we show how the phone’s power profile can be used to identify what route the user is taking from among a small set of possible routes (say, 30 routes). Although we view it as a warm-up, building towards our main results, route distinguishability is still quite useful. For example, if the attacker is familiar with the user’s routine then the attacker can pre-measure all the user’s normal routes and then repeatedly locate the user among those routes.

Route distinguishability is a classification problem: we collected power profiles associated with known routes and want to classify new samples based on this training set. We treat each power profile as a time series which needs to be compared to other time series. A score is assigned after each comparison, and based on these scores we select the most likely matching route. Because different rides along the same route can vary in speed at different locations along the ride, and because routes having the same label can vary slightly at certain points (especially before getting to a highway and after exiting it), we need to compare profile features that can vary in time and length and allow for a certain amount of difference. We also have to compensate for different baselines in power consumption due to constant components that depend on the running applications and on differences in device models.

We use a classification method based on Dynamic Time Warping (DTW) [18], an algorithm for measuring similarity between temporal sequences that are misaligned and vary in time or speed. We compute the DTW distance² between the new power profile and all reference profiles associated with known routes, selecting the known route that yields the minimal distance. More formally, if the reference profiles are given by sequences $\{X_i\}_{i=1}^{n}$, and the unclassified profile is given by sequence $Y$, we choose the route $i^{*}$ such that

$$i^{*} = \arg\min_{i} \mathrm{DTW}(Y, X_i)$$

which is equivalent to 1-NN classification given the DTW metric.

² In fact we compute a normalized DTW distance, as we have to compensate for difference in lengths of different routes - a longer route might yield larger DTW distance despite being more similar to the tested sequence.

Because the profiles might have different baselines and variability, we perform the following normalization for each profile prior to computing the DTW distance: we calculate the mean and subtract it, and divide the result by the standard deviation. We also apply some preprocessing in the form of smoothing the profiles using a moving average (MA) filter in order to reduce noise and obtain the general power consumption trend, and we downsample by a factor of 10 to reduce computational complexity.
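The classification pipeline of this section can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the moving-average window length, the absolute-difference local cost, the order of the preprocessing steps, and the normalization of the DTW distance by the sum of the two sequence lengths are all choices made here for the example.

```python
from statistics import mean, pstdev


def preprocess(profile, window=5, step=10):
    """Z-normalize, smooth with a trailing moving average, then downsample."""
    mu = mean(profile)
    sigma = pstdev(profile) or 1.0  # guard against a constant profile
    z = [(v - mu) / sigma for v in profile]
    smooth = [mean(z[max(0, i - window + 1):i + 1]) for i in range(len(z))]
    return smooth[::step]


def dtw_distance(a, b):
    """Classic DTW with |x - y| as local cost, divided by len(a) + len(b)."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf]
        for j, y in enumerate(b, start=1):
            cur.append(abs(x - y) + min(prev[j], prev[j - 1], cur[j - 1]))
        prev = cur
    return prev[-1] / (len(a) + len(b))


def classify(unknown, references):
    """1-NN under DTW: references maps a route label to a list of profiles."""
    y = preprocess(unknown)
    return min(
        (dtw_distance(y, preprocess(ref)), route)
        for route, refs in references.items()
        for ref in refs
    )[1]
```

Because each profile is z-normalized before comparison, a constant offset or scale difference between two recordings of the same route does not affect the result, which is the purpose of the normalization described above.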

5 Real-time mobile device tracking

In this section we consider the following task: the attacker knows that a mobile user is traveling along a particular route and our objective is to track the mobile device as it is moving along the route. We do not assume a particular starting point along the route, meaning, in probabilistic terms, that our prior on the initial location is uniform. The attacker has reference power profiles collected in advance for the target route, and constantly receives new power measurements from an application installed on the target phone. Its goal is to locate the device along the route, and continue tracking it in real-time as it travels along the route.

5.1 Tracking via Dynamic Time Warping

This approach is similar to that of route distinguishability, but we use only the measurements collected up to this point, which comprise a sub-sequence of the entire route profile. We use the Subsequence DTW algorithm [18], rather than the classic DTW, to search a sub-sequence in a larger sequence, and return a distance measure as well as the corresponding start and end offsets.

We search for the sequence of measurements we have accumulated since the beginning of the drive in all our reference profiles and select the profile that yields the minimal DTW distance. The location estimate corresponds to the location associated with the end offset returned by the algorithm.
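The difference from classic DTW is that the match may begin and end anywhere in the reference. A minimal sketch of this idea follows, assuming an absolute-difference local cost and the basic step pattern. The function name is hypothetical, and the variant described in [18] may differ in details such as step constraints.

```python
def subsequence_dtw(query, reference):
    """Find the best match of `query` inside `reference`.

    Returns (distance, start, end), where start and end are inclusive
    0-based indices into `reference`.
    """
    inf = float("inf")
    m = len(reference)
    # Row 0 is all zeros: the match may start at any reference offset for free.
    prev_cost = [0.0] * (m + 1)
    prev_start = list(range(m + 1))
    for x in query:
        cur_cost = [inf] * (m + 1)
        cur_start = [0] * (m + 1)
        for j in range(1, m + 1):
            # Predecessors: vertical, diagonal, horizontal. Each carries the
            # reference index at which its warping path started.
            best_cost, best_start = min(
                (prev_cost[j], prev_start[j]),
                (prev_cost[j - 1], prev_start[j - 1]),
                (cur_cost[j - 1], cur_start[j - 1]),
            )
            cur_cost[j] = abs(x - reference[j - 1]) + best_cost
            cur_start[j] = best_start
        prev_cost, prev_start = cur_cost, cur_start
    # The match may also end anywhere: take the cheapest cell of the last row.
    end = min(range(1, m + 1), key=lambda j: prev_cost[j])
    return prev_cost[end], prev_start[end], end - 1
```

For tracking, `query` would be the preprocessed measurements received so far and `reference` a preprocessed profile of the whole route. The returned `end` index is then mapped to the location recorded at that sample of the reference drive.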

5.2 Improved tracking via a motion model

While the previous approach can make mistakes in location estimation due to a match with an incorrect location, we can further improve the estimation by imposing rules based on a sensible motion model. We first need to know when we are “locked” on the target. For this purpose we define a similarity threshold so that if the similarity score of the best match (which is high when the minimal DTW distance is low) is above this threshold, we are in a locked state. Once we are locked on the target, we perform a simple sanity check at each iteration: “Has the target displaced by more than X?”

If the sanity check does not pass we consider the estimation unlikely to be accurate, and simply output the previous estimate as the new estimated location. If the similarity is below the threshold, we switch to an unlocked state, and stop performing this sanity check until we are “locked” again. Algorithm 1 presents this logic as pseudocode.

Algorithm 1 Improved tracking using a simple motion model

    locked ← false                          ▷ Are we locked on the target?
    while target moving do
        loc[i], score ← estimateLocation()
        d ← getDistance(loc[i], loc[i−1])
        if locked and d > MAX_DISP then
            loc[i] ← loc[i−1]               ▷ Reuse previous estimate
        end if
        if score > THRESHOLD then
            locked ← true
        end if
    end while
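A runnable rendering of this logic is sketched below. It treats a location as a single number (for instance, kilometers along the route) and takes the per-step estimates as input instead of calling `estimateLocation()`. Both are simplifications made for this example. Unlike the pseudocode, the sketch also switches back to the unlocked state when the score falls to or below the threshold, following the prose description above.

```python
def track(estimates, max_disp, threshold):
    """Filter a stream of (location, score) estimates with a motion model.

    estimates : iterable of (location, similarity score) pairs
    max_disp  : largest plausible displacement between two consecutive steps
    threshold : similarity above which we consider ourselves locked
    """
    locked = False
    out = []
    for loc, score in estimates:
        # While locked, reject jumps that the target could not have made.
        if locked and out and abs(loc - out[-1]) > max_disp:
            loc = out[-1]  # reuse the previous estimate
        out.append(loc)
        # Lock on a good match, unlock on a poor one (per the prose).
        locked = score > threshold
    return out
```

For example, with `max_disp=1.0` and `threshold=0.5`, an isolated estimate that jumps 8 km away while the tracker is locked is replaced by the previous estimate.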

6 Inference of new routes

In Section 4 we addressed the problem of identifying the route traversed by the phone, assuming the potential routes are known in advance. This assumption allowed us to train our algorithm specifically for the potential routes. As previously mentioned, there are indeed many real-world scenarios where it is applicable. Nevertheless, in this section we set out to tackle a broader tracking problem, where the future potential routes are not explicitly known. Here we specifically aim to identify the final location of the phone after it traversed an unknown route. We assume that the area in which the mobile device owner moves is known, however the number of all possible routes in that area may be too large to practically pre-record each one. Such an area can be, for instance, a university campus, a neighborhood, a small town or a highway network.

We address this problem by pre-recording the power profiles of all the road segments within the given area. Each possible route a mobile device may take is a concatenation of some subset of these road segments. Given a power profile of the tracked device, we will reconstruct the unknown route using the reference power profiles corresponding to the road segments. The reconstructed route will enable us to estimate the phone’s final location. Note that, due to the hysteresis of hand-offs between cellular base stations, power consumption is not only dependent on the traveled road segment, but also on the previous road segment the device came from.

In Appendix A we formalize this problem as a hidden Markov model (HMM) [22]. In the following we describe a method to solve the problem using a particle filter. The performance of the algorithm will be examined in the next section.

6.1 Particle Filter

A particle filter [1] is a method that estimates the state of an HMM at each step based on observations up to that step. The estimation is done using a Monte Carlo approximation where a set of samples (particles) is generated at each step that approximate the probability distribution of the states at the corresponding steps. A comprehensive introduction to particle filters and their relation to general state-space models is provided in [23].

We implement the particle filter as follows. We denote $O^r = \{o^r_{xyz}\}$, where $o^r_{xyz}$ is a power profile prerecorded over segment $(y,z)$ while the segment $(x,y)$ had been traversed just before it. We use a discrete time resolution $\tau = 3$ seconds. We denote $\Delta^{yz}_{\min}$ and $\Delta^{yz}_{\max}$ to be the minimum and maximum time durations to traverse road segment $(y,z)$, respectively. We assume these bounds can be derived from prerecordings of the segments. At each iteration $i$ we have a sample set of $N$ routes $P_i = \{(Q,T)\}$. The initial set of routes $P_0$ is chosen according to $\Pi$. At each step, we execute the following algorithm:

Algorithm 2 Particle filter for new routes estimation

    for all route p in P do
        t_end ← end time of p
        (x, y) ← last segment of p
        z ← next intersection to traverse (distributed by A)
        $W_p \leftarrow \min_{t \in [\Delta^{yz}_{\min},\, \Delta^{yz}_{\max}],\; o^r_{xyz} \in O^r_{xyz}} \mathrm{DTW}\left(O_{[t_{end},\, t_{end}+t]},\, o^r_{xyz}\right)$
        p ← p || (y, z)
        Update the end time of p
    end for
    Resample P according to the weights W_p

At each iteration, we append a new segment, chosen according to the prior A, to each possible route (represented by a particle). Then, the traversal time of the new segment is chosen so that it will have a minimal DTW distance to the respective time interval of the tracked power profile. We take this minimal distance as the weight of the new route. After normalizing the weights of all routes, a resampling phase takes place. N routes are chosen from the existing set of routes according to the particle weights distribution³. The new resampled set of routes is the input to the next iteration of the particle filter. The total number of iterations should not exceed an upper bound on the number of segments that the tracked device can traverse. Note however that a route may exhaust the examined power profile before the last iteration (namely, the end time of that route reached $t_{max}$). In such a case we do not update the route in all subsequent iterations (this case is not described in Algorithm 2 to facilitate fluency of exposition).
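One iteration of this procedure might look as follows. This is a rough sketch, not the authors' implementation, and several parts are assumptions made for the example: the next intersection is drawn uniformly from a neighbor list instead of from the prior A, the DTW function is passed in as a parameter, and the DTW distance d is turned into a resampling weight with exp(-d) so that closer matches are more likely to survive (the text does not specify this mapping). As in Algorithm 2, routes that exhaust the observed profile are not handled.

```python
import math
import random


def particle_filter_step(particles, observed, neighbors, references,
                         bounds, dtw, rng=None):
    """Extend every particle by one road segment, then resample.

    particles : list of (route, t_end); route is a list of >= 2 intersections
    observed  : tracked power profile, one sample per time step
    neighbors : dict y -> list of intersections z reachable from y
                (ASSUMED uniform stand-in for the prior A)
    references: dict (x, y, z) -> list of reference profiles for segment (y, z)
                recorded after traversing (x, y)
    bounds    : dict (y, z) -> (min_steps, max_steps) traversal duration
    dtw       : function (sequence, sequence) -> distance
    """
    rng = rng or random.Random()
    extended, distances = [], []
    for route, t_end in particles:
        x, y = route[-2], route[-1]
        z = rng.choice(neighbors[y])
        lo, hi = bounds[(y, z)]
        # Choose the traversal time and reference with the minimal distance.
        best_d, best_t = min(
            (dtw(observed[t_end:t_end + t], ref), t)
            for t in range(lo, hi + 1)
            for ref in references[(x, y, z)]
        )
        extended.append((route + [z], t_end + best_t))
        distances.append(best_d)
    # ASSUMPTION: map distance to weight so that small distances dominate.
    weights = [math.exp(-d) for d in distances]
    # Sampling with replacement: the same route may be chosen more than once.
    return rng.choices(extended, weights=weights, k=len(particles))
```

Repeating this step, starting from an initial particle set, yields the set of N candidate routes from which a final estimate is selected.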

Before calculating the DTW distance of a pair of power profiles the profiles are preprocessed to remove as much noise as possible. We first normalize the power profile by subtracting its mean and dividing by the standard deviation of all values included in that profile. Then, we zero out all power values below a threshold percentile. This last step allows us to focus only on the peaks in power consumption where the radio’s power consumption is dominant while ignoring the lower power values for which the radio’s power has a lesser effect. The percentile threshold we use in this paper is 90%.
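The preprocessing just described can be sketched as follows. The function name is hypothetical, and the nearest-rank definition of the percentile is an assumption, since the text does not say which percentile estimator is used.

```python
import math
from statistics import mean, pstdev


def keep_peaks(profile, percentile=90.0):
    """Z-normalize a profile, then zero every value below the given percentile."""
    mu = mean(profile)
    sigma = pstdev(profile) or 1.0  # guard against a constant profile
    z = [(v - mu) / sigma for v in profile]
    ordered = sorted(z)
    # Nearest-rank percentile (ASSUMED estimator).
    rank = max(1, math.ceil(percentile * len(ordered) / 100.0))
    cutoff = ordered[rank - 1]
    return [v if v >= cutoff else 0.0 for v in z]
```

With the 90% threshold used in the paper, roughly the top tenth of the normalized samples survive and everything else is set to zero before DTW is applied.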

Upon its completion, the particle filter outputs a set of N routes of various lengths. To select the best estimate route the simple approach is to choose the route that appears most often in the output set, as it has the highest probability to occur. Nonetheless, since a route is composed of multiple segments chosen at separate steps, at each step the weight of a route is determined solely based on the last segment added to the route. Therefore, the output route set is biased in favor of routes ending with segments that were given higher weights, while the weights of the initial segments have a diminishing effect on the route distribution with every new iteration. To counter this bias, we choose another estimate route using a procedure we call iterative majority vote, described in Appendix B.

7 Experiments

7.1 Data collection

Our experiments required collecting real power consumption data from smartphone devices along different routes. We developed the PowerSpy Android application⁴ that collects various measurements including signal strength, voltage, current, GPS coordinates, temperature, state of discharge (battery level) and cell identifier. The recordings were performed using Nexus 4, Nexus 5 and HTC mobile devices.

³ Note that the resampling of the new routes can have repetitions. Namely, the same route can be chosen more than one time.

⁴ To be released on the Google Play market as open source after unblinding.

Figure 4: Power profile with a phone call occurring between 50-90 seconds. Profile region during phone call is marked in red. [Plot: power in watts versus time in seconds.]

7.2 Assumptions and limitations

Exploring the limits of our attack, i.e. establishing the minimal necessary conditions for it to work, is beyond our resources. For this reason, we state the assumptions on which we rely in our methods.

We assume there is enough variability in power consumption along a route to exhibit unique features. Lack of variability may be due to high density of cellular antennas that flatten the signal strength profile. We also assume that enough communication is occurring for the signal strength to have an effect on power consumption. This is a reasonable assumption, since background synchronization of data happens frequently in smartphone devices. Moreover, the driver might be using navigation software or streaming music. However, at this stage, it is difficult to determine how inconsistent phone usage across different rides will affect our attacks.

Identifying which route the user took involves understanding which power measurements collected from her mobile device occur during driving activity. Here we simply assume that we can identify driving activity. Other works (e.g., [17]) address this question by using data from other sensors that require no permissions, such as gyroscopes and accelerometers.

Some events that occur while driving, such as an incoming phone call, can have a significant effect on power consumption. Figure 4 shows the power profile of a device at rest when a phone call takes place (the part marked in red). The peak immediately after the phone call is caused by using the phone to terminate the phone call and turn off the display. We can see that this event appears prominently in the power profile, and we can cope with such transient effects by identifying and truncating peaks that stand out in the profile. In addition, smoothing the profile by a moving average should mitigate these transient effects.

7

7.3 Route distinguishability

To evaluate the algorithm for distinguishing routes (Section 4) we recorded reference profiles for multiple different routes. The profiles include measurements from both Nexus 4 and Nexus 5 models. In total we had a dataset of 294 profiles, representing 36 unique routes. Driving in different directions along the same roads (from point A to B vs. from point B to A) is considered two different routes. We perform cross validation using multiple iterations (100 iterations), each time using a random portion of the profiles as a training set, and requiring an equal number of samples for each possible class. The sizes of the training and test sets depend on how many reference profiles per route we require each time. Naturally, the more reference profiles we have, the higher the identification rate.
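The evaluation protocol can be sketched as repeated random splitting with a fixed number of reference profiles per route. The sketch below is an interpretation, not the authors' evaluation code. The function names are hypothetical, and the rule that a route with too few recordings is left out of a round is an assumption made here.

```python
import random
from collections import defaultdict


def split_profiles(labeled, refs_per_route, rng):
    """Split (route, profile) pairs into a training dict and a test list.

    Each included route contributes exactly refs_per_route training profiles.
    ASSUMPTION: routes without at least one extra profile are skipped.
    """
    by_route = defaultdict(list)
    for route, profile in labeled:
        by_route[route].append(profile)
    train, test = {}, []
    for route, profiles in by_route.items():
        if len(profiles) <= refs_per_route:
            continue
        shuffled = rng.sample(profiles, len(profiles))
        train[route] = shuffled[:refs_per_route]
        test += [(route, p) for p in shuffled[refs_per_route:]]
    return train, test


def identification_rate(labeled, refs_per_route, classify, iterations=100, seed=0):
    """Average fraction of test profiles whose route is identified correctly."""
    rng = random.Random(seed)
    correct = total = 0
    for _ in range(iterations):
        train, test = split_profiles(labeled, refs_per_route, rng)
        for route, profile in test:
            correct += classify(profile, train) == route
            total += 1
    return correct / total if total else float("nan")
```

Here `classify(profile, train)` stands for any classifier that takes an unknown profile and a dictionary of reference profiles per route, such as the DTW-based 1-NN method of Section 4.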

One evaluation round included 29 unique routes, with only 1 reference profile per route in the training set, and 211 test routes. It resulted in a correct identification rate of 40%, compared to a random guess probability of only 3%. Another round included 25 unique routes, with 2 reference profiles per route in the training set and 182 routes in the test set, and resulted in a correct identification rate of 53% (compared to a random guess probability of only 4%). Having 5 reference profiles per route (for 17 unique routes) raises the identification rate to 71%, compared to a random guess probability of 5.8%. Finally, for 10 reference profiles per route (8 unique routes) we get 85% correct identification. The results are summarized in table 1.

We can see that an attacker can have a significant advantage in guessing the route taken by a user.
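The random guess probabilities quoted above are consistent with picking uniformly among the N candidate routes, i.e. 1/N. A quick check under that assumption (the function name is ours, for illustration only):

```python
def random_guess_percent(num_routes):
    """Chance of guessing the right route uniformly at random, in percent."""
    return 100.0 / num_routes

if __name__ == "__main__":
    # Numbers of unique routes used in the evaluation rounds.
    for n in (29, 25, 17, 8):
        print(n, f"{random_guess_percent(n):.1f}%")
```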

7.4 Real-time mobile device tracking

We evaluate the algorithm for real-time mobile device tracking (section 5) using a set of 10 training profiles and an additional test profile. The evaluation simulates the conditions of real-time tracking by serially feeding samples to the algorithm as if they are received from an application installed on the device. We calculate the estimation error, i.e. the distance between the estimated coordinates and the true location of the mobile device at each step of the simulation. We are interested in the convergence time, i.e. the number of samples it takes until the location estimation is close enough to the true location, as well as in the distribution of the estimation errors given by a histogram of the absolute values of the distances.
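The metrics described above can be sketched as follows. This is an illustration under our own assumptions: coordinates are (latitude, longitude) pairs in degrees, distance is the great-circle (haversine) distance on a spherical Earth, and "close enough" is a hypothetical 1 km threshold.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (spherical approximation)

def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

def estimation_errors(estimates, truth):
    """Per-sample distance between estimated and true locations."""
    return [haversine_m(e, t) for e, t in zip(estimates, truth)]

def convergence_index(errors, threshold_m=1000.0):
    """Index of the first sample whose error is below the threshold, or None."""
    return next((i for i, e in enumerate(errors) if e < threshold_m), None)

def fraction_below(errors, threshold_m=1000.0):
    """Share of samples whose error is below the threshold."""
    return sum(e < threshold_m for e in errors) / len(errors)

if __name__ == "__main__":
    errors = [5000.0, 3000.0, 800.0, 200.0]
    print(convergence_index(errors), fraction_below(errors))
```

A stricter definition of convergence (for example, requiring the error to stay below the threshold afterwards) would only change `convergence_index`.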

Figure 5 illustrates the performance of our tracking algorithm for one of the routes, which was about 19 kilometers long. At the beginning, when there are very few power samples, the location estimation is extremely inaccurate, but after two minutes we lock on the true location. We obtained a precise estimate from 2 minutes up until 20 minutes on the route, where our estimate slightly diverges, due to increased velocity on a freeway segment. Around 26 minutes (in figure 5a) we have a large estimation error, but as we mentioned earlier, these kinds of errors are easy to prevent by imposing a simple motion model (section 5.2). Most of the errors are small compared to the length of the route: 80% of the estimation errors are less than 1 km.

We also tested the improved tracking algorithm explained in section 5.2. Figure 5b presents the estimation error over time, and we can see that the big errors towards the end of the route that appeared in fig. 5a are not present in fig. 5b. Moreover, now almost 90% of the estimation errors are below 1 km (fig. 6).

We provide animations visualizing our results for real-time tracking at the following links (see footnote 5). The animations, generated using our estimations of the target’s location, depict a moving target along the route and our estimation of its location. The first one corresponds to the method described in 5.1, and the second to the one described in 5.2 that uses the motion model based correction:

http://somevid.com/WOrAmNkd8KIjGv892veL

http://somevid.com/L65Gxe0edBTXY3y26Va3

7.5 Inference of new routes

7.5.1 Setup

For the evaluation of the particle filter presented in Section 6 we considered an area schematically depicted in Figure 7. The area has 13 intersections connected by 35 road segments (see footnote 6). The average length of a road segment is about 400 meters. The average travel time over the segments is around 70 seconds. The area is located in the center of a medium-sized city having a population density comparable to Philadelphia or Miami. Traffic congestion in this area varies across segments and time of day. For each power recording, the track traversed at least one congested segment. Most of the 13 intersections have traffic lights, and about a quarter of the road segments pass through them.

We had three pre-recording sessions which in total covered all segments. Each road segment was entered from every possible direction to account for the hysteresis effects. The pre-recording sessions were done using the same Nexus 4 phone.

We set the following parameters of the HMM (as they are defined in Appendix A):

Footnote 5: We link temporarily to an anonymous video upload service. The links will be changed to the project website in the final version.

Footnote 6: Three of the segments are one-way streets, as depicted in Figure 7.


# Unique Routes   # Ref. Profiles/Route   # Test Routes   Correct Identification %   Random Guess %
8                 10                      55              85                         13
17                5                       119             71                         6
17                4                       136             68                         6
21                3                       157             61                         5
25                2                       182             53                         4
29                1                       211             40                         3

Table 1: Route distinguishability evaluation results. First column indicates the number of unique routes in the training set. Second column indicates the number of training samples per route at the attacker’s disposal. Number of test routes indicates the number of power profiles the attacker is trying to classify. Correct identification percentage indicates the percentage of correctly identified routes as a fraction of the third column (test set size), which can then be compared to the expected success of random guessing in the last column.

Figure 5: Location estimation error for online tracking. (a) Convergence to true location. (b) Location estimation error for improved tracking algorithm. Axes in both panels: Time [sec] vs. Error [meters].

Figure 6: Estimation errors distribution for motion-model tracking. (a) Errors histogram (Error [meters] vs. Error Count). Almost 90% of the errors are less than 1 km. (b) Error cumulative distribution (over Error [meters]).


Figure 7: Area for route inference (schematic graph with intersections numbered 1 to 13). It is depicted schematically to serve the anonymity of this submission. The area in full detail will be included in the final version.

1. A – This set defines the transition probabilities between the road segments. We set these probabilities to be uniformly distributed over all possible transitions. Namely, $a_{xyz} = 1/|I_y|$, where $I_y = \{w \mid (y,w) \in R,\ w \neq x\}$.

2. B – This set defines the distribution of power profile observations over each state. These probabilities depend on the road segments and their location relative to the nearby base stations. We do not need an explicit formulation of these probabilities to employ the particle filter. The likelihood of a power profile being associated with a road segment is estimated by the DTW distance of the power profile to prerecorded power profiles of that segment.

3. Π – This set defines the initial state distribution. We assume that the starting intersection of the tracked device is known. This applies to scenarios where the tracking begins from well-known locations, such as the user’s home, office, or another location the attacker knows in advance.
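The uniform transition probabilities of item 1 can be sketched as below. The data layout is our own assumption for illustration: R is given as a list of directed (x, y) intersection pairs, and the result maps (x, y, z) to $a_{xyz}$. Excluding w = x from $I_y$ means an immediate U-turn back to x is not a possible transition.

```python
from collections import defaultdict

def uniform_transitions(segments):
    """Build a_xyz = 1/|I_y| with I_y = {w : (y, w) in R, w != x}.

    segments: list of directed road segments (x, y), i.e. the set R.
    Returns a dict {(x, y, z): probability}; missing keys mean probability 0.
    """
    successors = defaultdict(set)
    for x, y in segments:
        successors[x].add(y)

    a = {}
    for x, y in segments:
        options = successors[y] - {x}  # I_y: everything reachable from y except x
        for z in options:
            a[(x, y, z)] = 1.0 / len(options)
    return a

if __name__ == "__main__":
    # Hypothetical toy map: intersection 2 joins two-way roads to 1, 3 and 4.
    R = [(1, 2), (2, 1), (2, 3), (3, 2), (2, 4), (4, 2)]
    a = uniform_transitions(R)
    print(a[(1, 2, 3)], a[(1, 2, 4)], a.get((1, 2, 1), 0.0))
```

A one-way street is represented by including only one of its two directed pairs in R.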

For testing, we used 4 phones: two Nexus 4 (different from the one used for the pre-recordings), a Nexus 5 and an HTC Desire. Each phone was used to record the power profile of a different route. The four routes combined cover almost all of the road segments in the area. Table 2 details the routes. These route recordings were done on different days, at different times of day and in varying weather conditions.

As noted, we can only measure the aggregate power consumption, which can be significantly affected by applications that run continuously. To have a better sense of the effects of these applications, the phones were run with different numbers of background applications. Nexus 4 #1, Nexus 5 and HTC Desire have a relatively modest number of applications which included (beyond the default Android apps): Email (corporate account), Gmail, and Google Calendar. Nexus 4 #2 has a much higher number of applications which included, on top of the applications of phone #1: Facebook, Twitter, Skype, Waze, and WhatsApp. All those applications periodically send and receive traffic.

Phone        Track
Nexus 4 #1   8-5-6-7-1-2-3-4-5-6-4-3-2-1-7-8
Nexus 4 #2   7-1-2-3-4-5-8-7-6-5-4-2-1-7-8
Nexus 5      3-2-4-9-10-12-11-9-4-5-6-4-3-2-1-7-6-5-8-7
HTC Desire   10-12-11-9-4-2-1-7-6-5-8

Table 2: Test Routes

For each of the four tracks we derived all possible sub-tracks having 3 to 7 road segments. We estimated each such sub-track. In total we estimated around 200 sub-tracks. For each sub-track we employed Algorithms 2 and 3 to get two best estimates for the sub-track.

Tables 3 to 5 summarize the results of the route estimations for each of the four phones. For each route we have two alternatives for the estimated route: (1) the most frequent route in the particle set as output by Algorithm 2; (2) the route output by Algorithm 3. For each alternative we note the road segment in which the phone is estimated to be after the completion of its track and compare it with the final road segment of the true route. This allows us to measure the accuracy of the algorithm for estimating the location of the user’s destination (the end of the track). This is the most important metric for many attack scenarios where the attacker wishes to learn the destination of the victim.

In some cases it may also be beneficial for the attacker to know the actual route which the victim traversed on his way to the destination. For this purpose, we also calculate for each alternative estimate the Levenshtein distance between it and the true route. The Levenshtein distance is a standard metric for measuring the difference between two sequences [13]. It equals the minimum number of single-element edits (insertions, deletions or substitutions) required to change one sequence into the other. In this context, we treat a route as a sequence of intersections. The distance is normalized by the length of the longer route of the two. This allows us to measure the accuracy of the algorithm for estimating the full track the user traversed. For each estimate we also note whether it is an exact fit with the true route (i.e., zero distance). The percentage of successful localization of destination, average Levenshtein distance and percentage of exact full route fits are calculated for each type of estimated route. We also calculate these metrics for both estimates combined, taking into account for each track the best of the two estimates. To benchmark the results we note in each table the performance of a random estimation algorithm which simply outputs a random, albeit feasible, route.

             random   frequent   Alg. 3   combined
Nexus 4 #1   33%      65%        48%      80%
Nexus 4 #2   31%      48%        56%      72%
Nexus 5      20%      33%        32%      55%
HTC Desire   22%      40%        41%      65%

Table 3: Destination localization
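The normalized Levenshtein distance used to compare an estimated route with the true route can be sketched as follows. This is a minimal illustration of the standard dynamic-programming formulation; the function names are ours, and routes are assumed to be sequences of intersection identifiers.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute (free if equal)
        prev = curr
    return prev[-1]

def normalized_route_distance(estimated, true):
    """Levenshtein distance divided by the length of the longer route."""
    longest = max(len(estimated), len(true))
    return levenshtein(estimated, true) / longest if longest else 0.0

if __name__ == "__main__":
    true_route = [8, 5, 6, 7, 1, 2, 3, 4]
    estimate = [8, 5, 6, 7, 1, 2, 4, 4]  # hypothetical estimate, one wrong intersection
    print(normalized_route_distance(estimate, true_route))
```

An exact full route fit corresponds to a distance of zero.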

The results in Table 3 show the accuracy of destination identification. It is evident that the performance of the most frequent route output by the particle filter is comparable to the performance of the best estimate output by Algorithm 3. However, their combined performance is significantly better than that of either estimate alone and predicts the final destination of the phone more accurately. This result tells us that Algorithm 3 extracts a significant amount of information from the routes output by the particle filter beyond the information gleaned from the most frequent route.

Table 3 indicates that for Nexus 4 #1 the combined route estimates were able to identify the final road segment for 80% of scenarios. For Nexus 4 #2, which runs many applications, the final destination estimates are somewhat less accurate (72%). This is attributed to the noisier measurements of the aggregate power consumption. The accuracy for the other two models, Nexus 5 and HTC Desire, is lower than the accuracy achieved for Nexus 4. Recall that all our pre-recordings were done using a Nexus 4. These results may indicate that the power consumption profile of the cellular radio is dependent on the phone’s model. Nonetheless, for both phones we achieve significantly higher accuracy of destination localization (55% and 65%) as compared to the random case (about 20%).

Tables 4 and 5 present two measures (Levenshtein distance and exact full route fit) of the accuracy of estimates for the full route the phone took to its destination. Here, again, the algorithm showed superior performance for Nexus 4 #1. It was able to exactly estimate 45% of the full routes to the destination. On the other hand, for the busier Nexus 4 #2 and the other model phones the performance was worse. It is evident from the results that for these three phones the algorithm had difficulties producing an accurate estimate of the full route. Nonetheless, in all cases the accuracy is markedly higher than that of the random case.

To give a better sense of the distance metric used to evaluate the quality of the estimated routes, Figure 8 depicts three cases of estimation errors and their corresponding distance values in increasing order. It can be seen that even an estimation error with a relatively high distance can carry a significant amount of information regarding the true route.

             random   frequent   Alg. 3   combined
Nexus 4 #1   0.61     0.38       0.27     0.24
Nexus 4 #2   0.63     0.61       0.59     0.52
Nexus 5      0.68     0.6        0.55     0.45
HTC Desire   0.65     0.59       0.5      0.45

Table 4: Levenshtein distance

             random   frequent   Alg. 3   combined
Nexus 4 #1   4%       38%        22%      45%
Nexus 4 #2   5%       8.5%       5%       15%
Nexus 5      3%       15%        9%       20%
HTC Desire   5%       10%        12%      17%

Table 5: Exact full route fit

8 Future directions

In this section we discuss ideas for further research, improvements, and additions to our method.

8.1 Power consumption inference

While new (yet very common) smartphone models contain an internal ampere-meter and provide access to current data, other models (for instance Galaxy S III) supply voltage but not current measurements. Therefore on these models we cannot directly calculate the power consumption. V-edge [26] proposes using voltage dynamics to model a mobile device’s power consumption. That and any other similar technique would extend our method and make it applicable to additional smartphone models.

Ref. [27] presents PowerTutor, an application that estimates power consumption by different components of the smartphone device based on voltage and state of discharge measurements. Isolating the power consumed by the cellular connectivity will improve our method by eliminating the noise introduced by other components such as audio/Bluetooth/WiFi etc. that do not directly depend on the route.

Figure 8: Examples of estimation errors and their corresponding distances (partial graph is depicted): (a) Distance = 0.125, (b) Distance = 0.25, (c) Distance = 0.43. The true route is green and the estimated route is red. Schematic graphs are used to serve anonymity of this submission. Plots with actual coordinates on top of real map will be included in the final version.

8.2 State of Discharge (SOD)

The time derivative of the State-of-Discharge (the battery level) is basically a very coarse indicator of power consumption. While it seemed to be too inaccurate for our purpose, there is a chance that extracting better features from it or having few possible routes may render distinguishing routes based on SOD profiles feasible. Putting it to the test is even more interesting given the HTML 5 Battery API that enables obtaining certain battery statistics from a web-page via JavaScript. Our findings demonstrate how future increases in the sampling resolution of the battery stats may make this API even more dangerous, allowing web-based attacks.
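To illustrate why the time derivative of the battery level is only a coarse indicator, the following sketch converts consecutive battery-level readings into average power. It rests on our own assumptions: the level is reported in whole percent, and the battery capacity (in watt-hours) is known; the capacity value in the example is hypothetical.

```python
def power_from_battery_level(times_s, levels_pct, capacity_wh):
    """Average power in watts between consecutive battery-level readings.

    times_s: sample times in seconds (increasing).
    levels_pct: battery level in percent at each sample time.
    capacity_wh: assumed full battery capacity in watt-hours.
    """
    powers = []
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        energy_wh = (levels_pct[i - 1] - levels_pct[i]) / 100.0 * capacity_wh
        powers.append(energy_wh * 3600.0 / dt)  # Wh -> joules, divided by seconds
    return powers

if __name__ == "__main__":
    # Hypothetical readings: the level drops by 1% and then 2% over 10-minute intervals.
    print(power_from_battery_level([0, 600, 1200], [100, 99, 97], capacity_wh=8.0))
```

With one-percent steps, each reading averages over many minutes of activity, which is far coarser than the per-second power samples used elsewhere in this work.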

8.3 Choice of reference routes

Successful classification depends, among other factors, on good matching between the power profile we want to classify and the reference power profiles. Optimal matching might be a matter of month, time of day, traffic on the road, and more. We can possibly improve our classification if we tag the reference profiles with those associated conditions and select reference profiles matching the current conditions when trying to distinguish a route. That of course requires collecting many more reference profiles.

8.4 Collecting a massive dataset

Collecting a massive dataset of power profiles associated with GPS coordinates is a feasible task given vendors’ capability to legally collect analytics about users’ use of their smartphones. Obtaining such a big dataset will enable us to better understand how well our approach can scale and whether it can be used with much less prior knowledge about the users.

9 Defenses

9.1 Non-defenses

One might think that by adding noise or limiting the sampling rate or the resolution of the voltage and current measurements one could protect location privacy. However, our method does not rely on high sampling frequency or resolution. In fact, our method works well with profiles much coarser than what we can directly get from the raw power data, and for the route distinguishing task we actually performed smoothing and downsampling of the data yet obtained good results. Our method also works well with signal strength, which is provided with much lower resolution and sampling frequency (see footnote 7).

9.2 Risky combination of power data and network access

One way of reporting voltage and current measurements to the attacker is via a network connection to the attacker’s server. Warning the user of this risky combination may somewhat raise the bar for this attack. There are of course other ways to leak this information. For instance, a malicious application disguised as diagnostic software can access power data and log it to a file, without attempting to make a network connection, while another, seemingly unrelated, application reads the data from that file and sends it over the network.

9.3 Secure hardware design

The problem with access to total power consumption is that it leaks the power consumed by the transceiver circuitry and communication related tasks that indicate signal strength. While power measurements can be useful for profiling applications, in many cases, examining the power consumed by the processors executing the software logic might be enough. We therefore suggest that supplying only measurements of the power consumed by the processors (excluding the power consumed by the TX/RX chain) could be a reasonable trade-off between functionality and privacy.

Footnote 7: In fact, since it reflects more directly the environmental conditions, signal strength data can provide even better route identification and tracking. We did not focus on signal strength since accessing it requires access permissions and it has already drawn research attention as useful for localization.

9.4 Requiring superuser privileges

A simple yet effective prevention may be requiring superuser privileges (or being root) to access power supply data on the phone. Thus, developers and power-users can install diagnostic software or run a version of their application that collects power data on a rooted phone, whereas the release version of the software excludes this functionality. This would of course prevent the collection of anonymous performance statistics from the install-base, but as we have shown, such data can indicate much more than performance.

9.5 Power consumption as a coarse location indicator

Just as the cell identifier is defined as a coarse location indicator and requires appropriate permissions to be accessed, power consumption data can also be defined as one. The user will then be aware, when installing applications that access voltage and current data, of the application’s potential capabilities and the risk potentially posed to her privacy.

This defense may actually be the most consistent with the current security policies of smartphone operating systems like Android and iOS, and their current permission schemes.

10 Related work

Power analysis is a powerful side-channel. The most well-known example is the recovery of encryption keys from a cryptographic system [11].

Prior work established the relationship between signal strength and power consumption in smartphones [4, 24]. Further, Bartendr [24] demonstrated that paths of signal strength measurements are stable across several drives.

Geolocation Techniques. Gentile et al. [8] discuss GSM techniques for localization of a mobile device. Some use fingerprinting, where a pre-recorded radio map of the area of interest is leveraged to infer location via best fit. Others use signal strength to calculate distances to base stations at known locations. All these methods [12, 19, 20, 25] require signal strength information and base station ID or WiFi network name (SSID), which are protected on Android and iOS. Our work does not rely on the signal strength, cell ID, or SSID. We only use power consumption, which is unprotected on Android.

Abusing smartphone sensors. An emerging line of work shows that phone sensors can be used in unexpected ways that can lead to unintended consequences.

SurroundSense [2] demonstrates how ambient sound and light can be used for mobile phone localization, and although it focuses on legitimate use-cases, the same methods could be leveraged for breaching privacy.

AccelPrint [7] is an attempt to fingerprint smartphones by tracking imperfections in their accelerometer measurements. Fingerprinting of mobile devices by the characteristics of their loudspeakers is proposed in [5, 6]. Lukas et al. [15] proposed a method for digital camera fingerprinting by pattern noise present in the images. [14] enhances the method, enabling identification of not only the model but particular cameras. Applied to smartphones it could give away a particular mobile device. Bojinov et al. [3] showed that various sensors on smartphones can be used to identify a mobile device by its unique hardware characteristics.

The Gyrophone study [16] showed that gyroscopes on smartphones can be used for eavesdropping on a conversation in the vicinity of the phone and identifying the speakers.

This line of research suggests that providing applications with unrestricted access to sensors can potentially result in a security breach and compromise sensitive information. In this paper we show that access to the phone’s power meter can similarly leak sensitive information such as the phone’s location.

11 Conclusion

We showed that applications that read the phone’s ampere-meter can gain information about the location of a mobile device without accessing the GPS or any other coarse location indicators. Our approach enables known route identification, real-time tracking, and identification of a new route by only analyzing the phone’s power consumption. We experimented with our approach on real-world data collected from popular smartphones that have a significant mobile market share, and demonstrated its effectiveness. We believe that with more data our approach can be made more accurate and reveal more information about the phone’s location.

Our work is an example of the unintended consequences that result from giving 3rd party applications access to sensors. It suggests that even seemingly benign sensors need to be protected by permissions, or at the very least, that more security modeling needs to be done before giving 3rd party applications access to sensors.

References

[1] ARULAMPALAM, M. S., MASKELL, S., GORDON, N., AND CLAPP, T. A tutorial on particle filters for online nonlinear/non-gaussian bayesian tracking. Signal Processing, IEEE Transactions on 50, 2 (2002), 174–188.

[2] AZIZYAN, M., CONSTANDACHE, I., AND ROY CHOUDHURY, R. Surroundsense: mobile phone localization via ambience fingerprinting. In Proceedings of the 15th annual international conference on Mobile computing and networking (2009), ACM, pp. 261–272.

[3] BOJINOV, H., MICHALEVSKY, Y., NAKIBLY, G., AND BONEH, D. Mobile device identification via sensor fingerprinting. arXiv preprint arXiv:1408.1416 (2014).

[4] CARROLL, A., AND HEISER, G. An analysis of power consumption in a smartphone. In USENIX Annual Technical Conference (2010).

[5] CLARKSON, W. B., AND FELTEN, E. W. Breaking assumptions: distinguishing between seemingly identical items using cheap sensors. Tech. rep., Princeton University, 2012.

[6] DAS, A., AND BORISOV, N. Poster: Fingerprinting smartphones through speaker. In Poster at the IEEE Security and Privacy Symposium (2014).

[7] DEY, S., ROY, N., XU, W., CHOUDHURY, R. R., AND NELAKUDITI, S. Accelprint: Imperfections of accelerometers make smartphones trackable. In Proceedings of the Network and Distributed System Security Symposium (NDSS) (2014).

[8] GENTILE, C., ALSINDI, N., RAULEFS, R., AND TEOLIS, C. Geolocation Techniques. Springer New York, New York, NY, 2013.

[9] GOLDSMITH, A. Wireless communications. Cambridge university press, 2005.

[10] HUANG, J., QIAN, F., GERBER, A., MAO, Z. M., SEN, S., AND SPATSCHECK, O. A close examination of performance and power characteristics of 4G LTE networks. In MobiSys (2012).

[11] KOCHER, P., JAFFE, J., AND JUN, B. Differential power analysis. In Advances in Cryptology – CRYPTO’99 (1999), Springer, pp. 388–397.

[12] KRUMM, J., AND HORVITZ, E. Locadio: Inferring motion and location from wi-fi signal strengths. In MobiQuitous (2004), pp. 4–13.

[13] LEVENSHTEIN, V. I. Binary codes capable of correcting deletions, insertions and reversals. In Soviet physics doklady (1966), vol. 10, p. 707.

[14] LI, C.-T. Source camera identification using enhanced sensor pattern noise. Information Forensics and Security, IEEE Transactions on 5, 2 (2010), 280–287.

[15] LUKAS, J., FRIDRICH, J., AND GOLJAN, M. Digital camera identification from sensor pattern noise. Information Forensics and Security, IEEE Transactions on 1, 2 (2006), 205–214.

[16] MICHALEVSKY, Y., BONEH, D., AND NAKIBLY, G. Gyrophone: Recognizing speech from gyroscope signals. In Proc. 23rd USENIX Security Symposium (SEC14), USENIX Association (2014).

[17] MOHAN, P., PADMANABHAN, V. N. V., AND RAMJEE, R. Nericell: rich monitoring of road and traffic conditions using mobile smartphones. In . . . of the 6th ACM conference on . . . (New York, New York, USA, Nov. 2008), ACM Press, p. 323.

[18] MULLER, M. Information Retrieval for Music and Motion. Springer Berlin Heidelberg, Berlin, Heidelberg, 2007.

[19] MUTHUKRISHNAN, K., VAN DER ZWAAG, B. J., AND HAVINGA, P. Inferring motion and location using WLAN RSSI. In Mobile Entity Localization and Tracking in GPS-less Environnments. Springer, 2009, pp. 163–182.

[20] OUYANG, R. W., WONG, A.-S., LEA, C.-T., AND ZHANG, V. Y. Received signal strength-based wireless localization via semidefinite programming. In Global Telecommunications Conference, 2009. GLOBECOM 2009. IEEE (2009), IEEE, pp. 1–6.

[21] POLLINI, G. P. Trends in handover design. Communications Magazine, IEEE 34, 3 (1996), 82–90.

[22] RABINER, L. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE (1989).

[23] RISTIC, B., ARULAMPALAM, S., AND GORDON, N. Beyond the kalman filter. IEEE AEROSPACE AND ELECTRONIC SYSTEMS MAGAZINE 19, 7 (2004), 37–38.

[24] SCHULMAN, A., NAVDA, V., AND RAMJEE, R. Bartendr: a practical approach to energy-aware cellular data scheduling. . . . conference on Mobile . . . (2010).

[25] SOHN, T., VARSHAVSKY, A., LAMARCA, A., CHEN, M. Y., CHOUDHURY, T., SMITH, I., CONSOLVO, S., HIGHTOWER, J., GRISWOLD, W. G., AND DE LARA, E. Mobility detection using everyday gsm traces. In UbiComp 2006: Ubiquitous Computing. Springer, 2006, pp. 212–224.

[26] XU, F., LIU, Y., LI, Q., AND ZHANG, Y. V-edge: fast self-constructive power modeling of smartphones based on battery voltage dynamics. Presented as part of the 10th USENIX . . . (2013).

[27] ZHANG, L., TIWANA, B., QIAN, Z., AND WANG, Z. Accurate online power estimation and automatic battery behavior based power model generation for smartphones. Proceedings of the . . . (2010).

A Formal model of new route inference

In this section we formalize the problem of the new route inference (Section 6) as a hidden Markov model (HMM) [22]. Let I denote the set of intersections in an area in which we wish to track a mobile device. A road segment is given by an ordered pair of intersections (x,y), defined to be a continuous road between intersection x and intersection y. We denote the set of road segments as R.

We assume that once a device starts to traverse a road segment it does not change the direction of its movement until it reaches the end of the segment. We define a state for each road segment. We say that the tracked device is in state $s_{xy}$ if the device is currently traversing a road segment (x,y), where $x,y \in I$. We denote the route of the tracked device as $(Q,T)$, where

$Q = \{q_1 = s_{x_1 x_2},\ q_2 = s_{x_2 x_3},\ \ldots\}$

$T = \{t_1, t_2, \ldots\}$

For such a route the device has traversed from $x_i$ to $x_{i+1}$ during time interval $[t_{i-1}, t_i]$ ($t_0 = 0$, $t_{i-1} < t_i\ \forall i > 0$).

Let $A = \{a_{xyz} \mid \forall x,y,z \in I\}$ be the state transition probability distribution, where

$a_{xyz} = p\{q_{i+1} = s_{yz} \mid q_i = s_{xy}\}$    (1)

Note that $a_{xyz} = 0$ if there is no road between intersections x and y or no road between intersections y and z.


A traversal of the device over a road segment yields a power consumption profile of length equal to the duration of that movement. We denote a power consumption profile as an observation o. Let B be the probability distribution of yielding a given power profile while the device traversed a given segment. Due to the hysteresis of hand-offs between cellular base stations, this probability depends on the previous segment the device traversed. Finally, let $\Pi = \{\pi_{xy}\}$ be the initial state distribution, where $\pi_{xy}$ is the probability that the device initially traversed segment (x,y). If there is no road segment between intersections x and y, then $\pi_{xy} = 0$. In our model we treat this initial state as the state of the device before the start of the observed power profile. We need to take this state into account due to the hysteresis effect. Note that an HMM is characterized by A, B, and Π.

The route inference problem is defined as follows. Given an observation of a power profile O over time interval $[0, t_{max}]$, and given a model A, B and Π, we need to find a route $(Q,T)$ such that $p\{(Q,T) \mid O\}$ is maximized. In the following we denote the part of O which begins at time $t'$ and ends at time $t''$ by $O_{[t',t'']}$. Note that $O = O_{[0,t_{max}]}$. We consider the time interval $[0, t_{max}]$ as having a discrete resolution of $\tau$.

B Choosing the best inferred route

Upon its completion, the particle filter described in Section 6.1 outputs a set of N routes of various lengths. Let us denote this set by $P_{final}$. This set provides an estimate of the distribution of routes given the power profile of the tracked device. To select the best estimate route, the simple approach is to choose the route that appears most often in $P_{final}$, as it has the highest probability to occur. Nonetheless, since a route is composed of multiple segments chosen at separate steps, at each step the weight of a route is determined solely based on the last segment added to the route. Therefore, in $P_{final}$ there is a bias in favor of routes ending with segments that were given higher weights, while the weights of the initial segments have a diminishing effect on the route distribution with every new iteration.

To counter this bias, we choose another estimate route using a procedure we call iterative majority vote. This procedure ranks the routes based on the prevalence of their prefixes. At each iteration i the procedure calculates Prefix[i], a list of prefixes of length i ranked by their prevalence among all routes that have a prefix in Prefix[i-1]. Prefix[i][n] denotes the prefix of rank n. The operation p||j, where p is a route and j is an intersection, denotes appending j to p. The procedure performed at each iteration i is detailed in Algorithm 3. In the following we denote by RoutePrefixed(R, p) the subset of routes out of the set R having p as their prefix.

Algorithm 3 Iterative majority vote
  I′ ← I
  while not all prefixes found do
    Prf ← next prefix from Prefix[i].
    Find j ∈ I′ that maximizes RoutePrefixed(RoutePrefixed(P_final, Prf), Prf||j)
    if no such j is found then
      I′ = I
      continue loop
    end if
    Prefix[i+1] ← Prefix[i+1] ∪ {Prf||j}
    I′ = I′ − {j}
  end while

At each iteration i we rank the prefixes based on the ranks of prefixes of the previous iteration. Namely, prefixes which are extensions of a shorter prefix having a higher rank in a previous iteration will always get higher ranking over prefixes which are extensions of a lower rank prefix. At each iteration the procedure first finds the most common prefixes of length i+1 which start with the most common prefix of length i found in the previous iteration and ranks them according to their prevalence. Then the procedure looks for common prefixes of length i+1 that start with the second most common prefix of length i found in the previous iteration, and so on until all prefixes of length i+1 are found. The intuition of this procedure is as follows. The procedure gives preference to routes traversing segments that are commonly traversed by other routes. Such segments received a high score during the steps in which they were chosen. Since we cannot pick the most common segments separately from each step (a continuous route probably will not emerge), we iteratively pick the most common segment out of the routes that are prefixed with the segments that were already chosen.
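The difference between the two estimates can be illustrated with a simplified sketch. It is not a full implementation of Algorithm 3: it follows only the top-ranked prefix at each iteration instead of maintaining the whole ranked list Prefix[i], and the stopping rule for routes of different lengths is our own assumption. Routes are represented as tuples of intersections.

```python
from collections import Counter

def most_frequent_route(routes):
    """The simple estimate: the route appearing most often in the particle set."""
    return Counter(tuple(r) for r in routes).most_common(1)[0][0]

def greedy_majority_route(routes):
    """Grow a route one intersection at a time by majority vote.

    At each step only routes sharing the prefix chosen so far may vote
    on the next intersection (top-ranked branch of the iterative vote).
    """
    routes = [tuple(r) for r in routes]
    prefix = ()
    while True:
        i = len(prefix)
        voters = [r for r in routes if len(r) > i and r[:i] == prefix]
        ended = sum(1 for r in routes if r == prefix)
        # Assumption: stop when no route continues, or when more routes
        # end exactly here than continue past this prefix.
        if not voters or ended > len(voters):
            return prefix
        next_hop = Counter(r[i] for r in voters).most_common(1)[0][0]
        prefix += (next_hop,)

if __name__ == "__main__":
    # Hypothetical particle set: (1, 2, 5) is the single most frequent route,
    # but more particles agree on the prefix (1, 3).
    particles = [(1, 2, 5)] * 4 + [(1, 3, 4)] * 3 + [(1, 3, 6)] * 2
    print(most_frequent_route(particles), greedy_majority_route(particles))
```

In this toy particle set the two estimates disagree, which is the situation in which taking the best of both estimates can help.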
