
Ville Honkavirta

Location fingerprinting methods

in wireless local area networks

Master of Science Thesis

Examiner: Professor Robert Piché (TUT)

Examiner and topic approved in the council meeting of

the Faculty of Science and Environmental Engineering

on 8th October, 2008

Preface

I first became familiar with the research work during the summer of 2007, when I worked as a research assistant in the personal positioning algorithms research group at the Department of Mathematics of Tampere University of Technology. At the beginning of 2008 I started the research on location fingerprinting, and the thesis was written mainly during the following summer. Work on the project was varied and educational. I made real measurements in the office area of the university and located myself with the implemented algorithms, which was very exciting. I learned a lot from the different aspects of this challenging research work.

Professor Robert Piché was the examiner of this thesis. I would like to thank him for the opportunity to work in the research group and for his valuable comments. I would also like to thank all my co-workers in the research group, and especially M.Sc. Simo Ali-Löytty, M.Sc. Tommi Perälä and Tuomo Mäki-Marttunen, for the numerous helpful conversations on this topic. Special acknowledgement goes to my fiancée Petra for her constant inspiration. I would also like to thank Nokia Corporation, which funded this work.

Tampere, 21st October 2008

Ville Honkavirta

Contents

Abstract
Tiivistelmä
Abbreviations and Acronyms
Symbols

1 Introduction

2 Location Fingerprinting
  2.1 The problem statement
  2.2 Radio map
    2.2.1 Raw data
    2.2.2 Mean
    2.2.3 Mean and variance
    2.2.4 Histogram
  2.3 Measurements of location estimation phase
  2.4 Deterministic location estimation
    2.4.1 Distance measures
    2.4.2 K-nearest neighbor
    2.4.3 Weighted K-nearest neighbor
    2.4.4 RADAR localization system
  2.5 Probabilistic location estimation
    2.5.1 Histogram method
    2.5.2 Kernel method
    2.5.3 Parametric approximation of measurement noise
    2.5.4 Histogram comparison method
  2.6 Deterministic location estimation methods in the probabilistic framework

3 Filtering Approach
  3.1 Filtering
  3.2 State models
  3.3 Bayesian filter
  3.4 Kalman filter approach
    3.4.1 Best linear unbiased estimator
    3.4.2 Kalman filter
    3.4.3 Non-linear Kalman filter

4 Implementations and Results
  4.1 Wireless local area network and IEEE 802.11 standard
  4.2 Radio maps
  4.3 Test data
  4.4 Static location estimation algorithms
    4.4.1 Effect of parameters in algorithms
    4.4.2 Radio map density
    4.4.3 Single orientation vs varying orientation
    4.4.4 Calibration time
    4.4.5 Number of access points in the test data
    4.4.6 Test data from different access points than calibration data
    4.4.7 Summary of static location estimation algorithms
  4.5 Filtering algorithms
    4.5.1 Applying state models to the static algorithms
    4.5.2 Non-linear Kalman filter

5 Conclusions and Future Work

Bibliography

A Structure of implementation
  A.1 Data structures
  A.2 Importing data to Matlab
    A.2.1 True coordinates of calibration points and test data
    A.2.2 Importing radio maps
    A.2.3 Importing test data
  A.3 Parsing data for location estimators
    A.3.1 Parsing radio map
    A.3.2 Parsing measurement structure
  A.4 Location solvers

Abstract

TAMPERE UNIVERSITY OF TECHNOLOGY

Faculty of Science and Environmental Engineering, Department of Mathematics

Honkavirta, Ville: Location Fingerprinting Methods in Wireless Local Area

Networks

Master of Science Thesis, 86 pages and 6 appendix pages

Examiner: Professor Robert Piché

November 2008

Satellite localization cannot provide adequate functionality in weak-signal environments, such as urban areas and indoors. Location fingerprinting benefits from the complex signal propagation by using the unique fingerprints of certain calibration points. In this work, different methods for exploiting the fingerprints in the location estimation phase are studied, first by presenting their mathematical formulation and then by testing them in a Wireless Local Area Network (WLAN).

Static location estimation algorithms can be divided into deterministic and probabilistic methods. Probabilistic methods interpret the normalized fingerprint histograms as distributions of the measurement. In the tests, the probabilistic method with the non-parametric kernel function approximation of the fingerprint histograms was the best static location estimation algorithm. The filtering approach includes the use of different state models and the computation of the unknown quantities of the Best Linear Unbiased Estimator (BLUE). The graph state model is based on the connection information between the calibration points according to the floor plan, and it clearly improved the results compared to the static estimation algorithms. However, the linear Kalman Filter (KF) with the stationary state model performed even better. The non-linear Kalman filter performed better than the static estimation algorithms, especially with a small number of access points (APs) and a sparse radio map.


Tiivistelmä

TAMPERE UNIVERSITY OF TECHNOLOGY (TAMPEREEN TEKNILLINEN YLIOPISTO)

Faculty of Science and Environmental Engineering, Department of Mathematics

Honkavirta, Ville: Sormenjälkipaikannuksen menetelmät langattomissa lähiverkoissa (Location Fingerprinting Methods in Wireless Local Area Networks)

Master of Science Thesis, 86 pages and 6 appendix pages

Examiner: Professor Robert Piché

November 2008

Satellite positioning cannot provide adequate accuracy in areas of weak signal coverage, such as cities and indoors. With the development of wireless communication and terminal devices, interest in developing different indoor positioning methods has grown strongly. In this work the Wireless Local Area Network (WLAN) has been exploited by measuring the signal strength (Received Signal Strength Indicator, RSSI) at different points, and the measurements have been used in location fingerprinting to create a radio map.

Location fingerprinting consists of two phases. In the calibration phase a radio map is collected, which contains the measurements, i.e. the fingerprints, at the calibration points together with the coordinates of these points. Location fingerprinting thus aims to turn the hard-to-predict signal propagation into an advantage by forming for each calibration point a characteristic fingerprint, which should differ as much as possible from the fingerprints of the neighboring calibration points. Signal strength is often used as the measurement because it forms a more reliable fingerprint than, for example, a measurement based on the time or the angle of arrival of the signal. In the location estimation phase the user's location is estimated using new measurements and the radio map collected in the calibration phase.

This work lays a foundation for the mathematical modeling and formulation of the methods of the location estimation phase. Traditionally these methods are divided into deterministic and probabilistic methods. In deterministic methods the goal is to obtain a single location estimate, which is based on comparing the measurements of the location estimation phase with the fingerprint measurements of the radio map. The probabilistic approach uses Bayesian estimation theory and computes the distribution of the user's location conditioned on the measurements. The normalized fingerprint histograms are then interpreted as distributions of the signal strength, on the basis of which an empirical, fingerprint-based measurement model can be formulated. In this work the user's location is modeled as a continuous random variable, which is based on dividing the area of interest into smaller rectangular areas, i.e. cells. The fingerprint of each cell consists of the measurements made at the center of the cell, i.e. at the calibration point, and the conditional distribution of the location becomes a density function that is constant in each cell. Different probabilistic methods are obtained by making different approximations of the fingerprint histograms. The deterministic approach is also considered from the probabilistic point of view.

The methods were tested and compared with each other using real signal strength measurements in a wireless local area network. The deterministic K-nearest neighbor (KNN) method worked best with K equal to one, in which case the location estimate was directly the location of the nearest neighbor, i.e. of a calibration point. This method is called the nearest neighbor (NN) method. Among the probabilistic methods, the non-parametric kernel approximation of the fingerprint histograms gave the best results.

In the filtering approach different state models were applied to improve the static location estimation algorithms. Of these state models, good results were obtained with the graph state model, which is built from the floor plan by examining the passages between the cells. In addition, the static location solution was used as the measurement in a linear Kalman Filter (KF), which gave even better results than the graph state model. The non-linear Kalman filter, which is based on the Best Linear Unbiased Estimator (BLUE) of the state, was also applied to location fingerprinting. In this work the unknown quantities of the BLUE are computed using the normalized fingerprint histograms. The non-linear Kalman filter gave better results than the static algorithms, especially with a sparse radio map and a small number of access points.

Abbreviations and Acronyms

AP access point

AOA Angle Of Arrival

BLUE Best Linear Unbiased Estimator

BS base station

BSS basic service set

CP calibration point

csv comma separated values

EKF Extended Kalman Filter

IEEE the Institute of Electrical and Electronics Engineers

i.i.d. independent and identically distributed

ISM Industrial, Scientific and Medical

KF Kalman Filter

KNN K-nearest neighbor

ME mean error

MMSE minimum mean square error

MSE mean square error

MU mobile unit

NN nearest neighbor

pdf probability density function

pdml packet details markup language

RSSI received signal strength indicator

SNR signal to noise ratio

TDOA Time Difference Of Arrival

TOA Time Of Arrival

UKF Unscented Kalman Filter

WKNN weighted K-nearest neighbor

WLAN Wireless Local Area Network

WNIC Wireless Network Interface Controller


Symbols

∫ Integral

∫B Integral over region B

∇ Gradient

| · | Size or cardinality of set, length of list or absolute value

⌜·⌟ Diagonal matrix

D−→ Convergence in distribution

APj jth AP

aij List of RSSI values measured from access point APj

aij Mean of samples from APj measured at ith CP

ai Vector of means of ith fingerprint samples from different APs

αi Weights used in computations of prior distribution in

non-linear Kalman filter

argmin f(·) Argument that minimizes f(·)

Bi ith grid cell

B Grid, where fingerprints are collected

b Histogram bin width

βi Characterization of posterior at cell Bi

β Brownian motion

bij jth endpoint of main diagonal of ith cell

bt tth histogram bin

di Orientation of MU during calibration at ith CP

dij Distance between ith and jth calibration points

diM(·) Mahalanobis distance computed by using

fingerprint of ith calibration point

δ(x) Dirac delta function

E(x) Expectation value of random variable x

ETR(x|y) Truncated expectation of piecewise continuous

posterior distribution

ek Mean squared error at time step tk

F Sigma algebra of event space


F State transition matrix

FS State transition matrix of stationary state model

FCV State transition matrix of constant velocity state model

Fx(x) Cumulative distribution function

f State transition function

Haij List aij presented as histogram

HNaij List aij presented as normalized histogram

Hyij List yij presented as histogram

HNyij List yij presented as normalized histogram

Hvij Probability density function of measurement noise at cell Bi

ht tth histogram bin height

hNt Normalized bin height

h Smoothing parameter of kernel function

h Known vector valued measurement function

I Identity matrix

Jji Set of histogram bin indexes corresponding to cell Bi and APj

K(x) Kernel function

k Time index

kij Number of bins in histogram

χBi(x) Function, which evaluates 1 if cell Bi includes x and 0 otherwise

LnK Set of calibration points corresponding K ”nearest” neighbors

in signal space

Lf Indexes of fingerprints, which include samples from same APs

as measurement y

Mi ith element in radio map

MMi ith element in radio map, which includes mean of samples

MMVi ith element in radio map, which includes mean and variance

of samples

MHi ith element in radio map, which includes whole histogram

of samples

M Number of cells in radio map grid

Ni Set of APs at range at ith calibration point

Ny Set of APs at range during location estimation phase

NµΣ(x) Probability density function of Gaussian distribution

with covariance matrix Σ > 0 and expected value µ

Ω Event space

P Probability

Pxxk Covariance matrix of state xk

Pxyk Covariance matrix of state xk and measurement yk

Pyyk Covariance matrix of measurement yk

Pk Covariance matrix of posterior distribution

P−k Covariance matrix of prior distribution

pi Coordinates of ith CP

px(x), p(x) Probability density function of random variable x

p(x, y) Joint probability density function of random variables x and y

p(x|y) Conditional probability density function

of random variable x given y = y

p(y|i) Likelihood at cell Bi.

||·||p p-norm

QC Diffusion matrix of Brownian motion

Q Covariance matrix of state model error

QS State noise covariance matrix of stationary state model

Rn n-dimensional vector space over field of real numbers

Ri Fingerprint measured at ith CP

R Set of all fingerprints

RK Set of K ”nearest” fingerprints in signal space

R Covariance matrix of measurement noise

θi Additional parameters stored to ith element of radio map

µ Expected value of Gaussian distribution

Σ Covariance matrix of Gaussian distribution

σ2ij Sample variance of samples in aij

σ2 Variance of Gaussian distribution

σ2C Diagonal elements of diffusion matrix of Brownian motion

xt,x(t) Stochastic process of state

xt tth element of vector x

xk Expected value of state xk

xN−W Nadaraya-Watson estimator

xMAP Maximum a posteriori estimator of state

xMEAN Mean estimator of state

xMMSE Minimum mean square error estimator of state

x ∼ N(µ, σ2) Random variable x has Gaussian distribution

with parameters µ and σ2

x ∼ Np(µ, Σ) Random variable x has p-dimensional Gaussian distribution

with parameters µ and Σ

x ∼ log-N(µ′, σ′2) Random variable x has log-Normal distribution

with parameters µ′ and σ′2

xML Maximum likelihood estimator

v Measurement noise


ωi Weights of Nadaraya-Watson estimator

V(x) Variance of random variable x

wi Weight assigned to ith CP in

weighted K-nearest neighbor method

wk State model error

yk Expected value of measurement yk

y Realized value of measurement

y Measurement received at location estimation phase

yj List of RSSI values measured from access point APj

Chapter 1

Introduction

Location-aware services have become popular with the development of modern communication technology. The increased variety of commercial applications has created a need for indoor localization services in particular. Weak signal reception and the missing line of sight between the user and the satellites of the Global Positioning System (GPS) have provided an incentive to develop localization systems for indoor and urban areas. Several solutions to indoor localization have been proposed; systems such as the Active Badge, the Cricket and the Bat rely on infrastructure that is specially designed for indoor localization [Hightower and Borriello 2001]. The Ekahau Positioning Engine (EPE) system is claimed to provide accuracy up to 1 m [Ekahau]. However, these kinds of purpose-built systems can be expensive and hard to implement in practice.

Localization systems can be categorized by the type of measurements they exploit.

Systems based on the Angle Of Arrival (AOA), Time Of Arrival (TOA) and the Time

Difference Of Arrival (TDOA) have been proposed [Vossiek et al. 2003]. These types

of measurements, however, encounter problems in the complex signal propagation

environments [Bahl and Padmanabhan 2000].


The increased deployment and popularity of the Wireless Local Area Network (WLAN) have opened a new opportunity for location-aware services. Although WLAN was not designed for localization, the radio signal can be used for location estimation by exploiting the Received Signal Strength Indicator (RSSI) value. Using the received signal strength also allows the existing infrastructure to be utilized, because no additional hardware is needed.

Location fingerprinting differs from other localization principles. Instead of determining the distance between the user and the transmitting access point (AP), the signal propagation is characterized by actually measuring the RSSI pattern at certain locations. This provides localization even in very complex environments, because the method is based not on a signal propagation model but on a database of measurements.

Many of the existing location fingerprinting methods lack a proper mathematical formulation and theoretical basis. Thus the first purpose of this work is to present the mathematical formulation of the most popular location fingerprinting methods. The second goal is to apply the linear and the non-linear Kalman filter to location fingerprinting. The third goal is to compare the different algorithms using tests with real measurements in varying circumstances and to evaluate the algorithms’ advantages and disadvantages.

This work is organized as follows. In Chapter 2 the basic concepts and methods of location fingerprinting are covered. The emphasis is on the mathematical formulation and on structuring the methods according to their theoretical background. The methods covered in Chapter 2 are divided into deterministic and probabilistic approaches, and the individual methods are introduced in the context of these wider concepts.

In Chapter 3 traditional location fingerprinting is extended to the computation of the location estimate as a time series, which leads to the application of different filters to location fingerprinting. The different state models are combined with the likelihoods provided by the static location estimation algorithms. The other approach is the exploitation of the Best Linear Unbiased Estimator (BLUE), which leads to the linear and the non-linear Kalman filter.

The design of the implementations and the data structures is presented in Chapter 4, and the testbed and the equipment used are explained. The different methods are tested and compared, and the results from a variety of tests are presented.

Chapter 5 summarizes the results of the work and suggests guidelines for future work.

Chapter 2

Location Fingerprinting

In this chapter the basic concepts and methods of location fingerprinting are considered. The emphasis is on the mathematical formulation of the concepts and the location estimation methods. The methods cover the static case, where only the measurements from the current time step are used to infer the location estimate. In Chapter 3 the location is estimated as a time series.

2.1 The problem statement

Location fingerprinting is a method of determining the location of the mobile unit (MU), which is a movable measurement device. Location fingerprinting involves two phases: the off-line or calibration phase, and the on-line or location estimation phase. In this work the terms calibration phase and location estimation phase are used. In the calibration phase, the Received Signal Strength Indicator (RSSI) from several access points (APs) is measured at chosen locations, called calibration points (CPs). These measurements are called the fingerprints of the CPs, and they are part of the calibrated radio map, which is discussed further in Section 2.2. In the location estimation phase the calibrated radio map, and essentially the measured fingerprints, are used for the localization of the MU. The basic idea of location fingerprinting is illustrated in Figure 2.1.

The mathematical formulation of the localization problem is troublesome in many ways. The first challenge is the modeling of the MU’s state $x$. In the literature the state $x$ often includes only the location of the MU and is modeled as a discrete deterministic variable, which is a consequence of the calibration at a discrete set of locations [Ladd et al. 2002]. The location estimate of the MU is obtained from the locations of the CPs, which leads to a discrete set of possible locations of the MU.


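[Figure 2.1 (block diagram). Calibration phase: the MU at the CPs measures the RSSI from AP1, AP2, ..., APn, and the measurements form the radio map. Location estimation phase: the MU at an unknown location measures the RSSI from AP1, AP2, ..., APn; the location estimation algorithm uses these measurements and the radio map to produce the location estimate.]

Figure 2.1: Two different phases of the location fingerprinting.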

The location estimation phase does not produce the probability density function of the state $x$, but a single realization of the location variable. This approach is called the deterministic approach and it is considered in Section 2.4.

In Section 2.5 a different approach is presented. Even though the actual location estimate can be interpolated between the CPs by combining the CP coordinates in different ways, it would be misleading to model the MU’s state $x$ as a discrete variable, because the MU can move around continuously during the location estimation phase. Therefore in Section 2.5 the state $x$ is modeled as a continuous random variable. The objective of the location estimation phase is then to obtain the whole distribution of the state, and thus this approach is called the probabilistic approach. To obtain the continuity of the state $x$, each fingerprint is modeled to represent also the fingerprints in the neighborhood of its calibration point. In location fingerprinting the locations of the CPs are chosen according to the floor plan by dividing the area of interest into smaller areas, with the CPs at the centers of the areas. An area can be, for example, a stairway, an office room or part of a corridor. In this work these areas are called cells $B_i$; they are assumed to be rectangular with sides parallel to the horizontal and vertical axes, and they form a grid $B$. Thus the probability density function (pdf) of the state is a piecewise continuous function that is constant within each cell.
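As an illustration only, one way to write such a cell-wise constant density is sketched below. The cell probabilities $\pi_i$ are a symbol introduced here for the sketch and are not part of the notation of this work; $\chi_{B_i}$ is the indicator function of the cell $B_i$, $|B_i|$ is the area of the cell and $M$ is the number of cells.

```latex
\[
p(x) \;=\; \sum_{i=1}^{M} \frac{\pi_i}{|B_i|}\,\chi_{B_i}(x),
\qquad \pi_i \ge 0, \qquad \sum_{i=1}^{M} \pi_i = 1 .
\]
```

Under these assumptions the density integrates to one, because each term contributes $\pi_i$ when integrated over its own cell, and $p$ takes the constant value $\pi_i / |B_i|$ inside the cell $B_i$.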


2.2 Radio map

As discussed, location fingerprinting consists of two phases, the calibration phase and the location estimation phase. The former includes the creation of the radio map, which is used as a reference during the latter. The radio map covers the area of interest and holds the RSSI values of the radio signal, collected as a function of location. The signal to noise ratio (SNR) is also available, but it is often omitted because the RSSI has a stronger correlation with the location than the SNR [Bahl and Padmanabhan 2000]. The RSSI is a measurement of the power present in a radio signal. However, RSSI values can vary when different wireless adapters and software are used [Kaemarungsi 2006], so it is reasonable to use the same hardware and software during the whole process. The unit of the RSSI is arbitrary, and in location fingerprinting methods it is not relevant, because the distance between the MU and the AP is never evaluated. Normally the RSSI is given in units of signal power, i.e. mW or dBm.

The design of the radio map is an important part of the location fingerprinting process, and it is affected by many factors, e.g. the accuracy demands and the floor plan. When the objective is to locate a certain item in a shop, a more fine-grained radio map is needed than for locating the nearest fire exit.

It is also possible to simulate the radio map by using a radio propagation model [Narzullaev et al. 2008]. Li et al. [2006] likewise try to reduce the time-consuming effort of radio map creation by adding CPs whose fingerprints are interpolated from the existing measurements.

The radio map can be modified or preprocessed before it is applied in the location estimation phase. The motive can be to reduce the memory required to store the radio map or to reduce the computational cost of the location estimation. In addition, different location estimation methods use different approximations of the fingerprints. The different ways to modify the raw data for the location estimation phase are therefore discussed next.


2.2.1 Raw data

During the calibration phase, RSSI values are measured at fixed locations for a certain period of time and stored in the radio map. The $i$th element in the radio map has the form

\[
M_i = \bigl(b_{i1},\, b_{i2},\, \underbrace{\{a_{ij} \mid j \in N_i\},\, \theta_i}_{R_i \in R}\bigr), \qquad i = 1, \ldots, M, \tag{2.1}
\]

where $a_{ij}$ is the list of RSSI values measured from access point $\mathrm{AP}_j$. The set $N_i$ contains the APs in range at the $i$th calibration point, that is, the APs that can be heard during the calibration time. Thus the number of APs heard is the size of the set $N_i$, denoted $|N_i|$. The number of samples measured from $\mathrm{AP}_j$ is the length of the list $a_{ij}$, denoted $|a_{ij}|$. In this work only 2-dimensional location is considered, and thus the vectors $b_{i1}$ and $b_{i2}$ contain the 2-dimensional coordinates of two opposite vertices of the $i$th grid cell, that is, the endpoints of its main diagonal. The idea is to store enough coordinates to be able to restore later all the vertices of the rectangle $B_i$ and the coordinates of the $i$th calibration point $p_i$. The calibration point is the center of the cell, which is the midpoint of the stored diagonal:

\[
p_i = \begin{bmatrix} \dfrac{b^1_{i1} + b^1_{i2}}{2} & \dfrac{b^2_{i1} + b^2_{i2}}{2} \end{bmatrix}^T,
\]

where the scalar $b^t_{i1}$ is the $t$th element of the coordinate vector $b_{i1}$. Using this notation, the cells $B_i$ can be defined as

\[
B_i = \{\, x \mid b^1_{i1} \le x_1 \le b^1_{i2} \;\wedge\; b^2_{i1} \le x_2 \le b^2_{i2} \,\},
\]

where $x_1$ and $x_2$ are the first and second coordinates of the state vector $x$, which is now assumed to include only the 2-dimensional location, and $b_{i1}$ is the vertex with the smaller coordinates. The area of the cell is then

\[
|B_i| = |b^1_{i1} - b^1_{i2}| \cdot |b^2_{i1} - b^2_{i2}|.
\]

The parameter $\theta_i$ in (2.1) contains any other information needed in the location estimation phase. This can be, for example, the orientation $\theta_i = d_i \in \{\text{north}, \text{south}, \text{east}, \text{west}\}$ of the MU, as in the RADAR system [Bahl and Padmanabhan 2000]. The variable $\theta_i$ can also include the time of day, the label of the $i$th CP, denoted $\mathrm{CP}_i$, or any other information that could be useful in the location estimation phase. All the stored parameters at the calibration point $p_i$ form a unique fingerprint $R_i$, and the set $R = \{R_1, \ldots, R_M\}$ is the set of all fingerprints. In Section 3.2 the variable $\theta_i$ includes the connection information between the cells $B_i$ and $B_j$ according to the floor plan.

Figure 2.2 illustrates an example of an element in the radio map without any additional variables stored in $\theta_i$. In the example the RSSI measurements are received from APs 1, 4 and 8. In general, different fingerprints include RSSI samples from different numbers of APs, which has to be considered during the location estimation phase. The example also illustrates that the number of samples in $a_{ij}$ can vary significantly: the stronger the RSSI from the AP, the more samples are received.

i = 5, M5 = (b51, b52, {a5j | j ∈ N5}), N5 = {1, 4, 8}

b51 = [1.5, -18.3]^T, b52 = [12.2, -11.2]^T

a51 = (-61 -60 -60 -57 -65 · · · -62)

a54 = (-91 -90 -92 -89 -90)

a58 = (-78 -77 -78 -80 -80 -81)

Figure 2.2: Example of an entry in the radio map M
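The raw-data radio map entry of Figure 2.2 could be held in a small data structure such as the following Python sketch. The class name `RadioMapEntry`, its field names and its methods are hypothetical names chosen for this illustration; they are not the data structures of the Matlab implementation in Appendix A. The center of the cell is computed as the midpoint of the two stored vertices. The list for AP 1 contains only the samples printed in the figure; the elided samples are left out.

```python
from dataclasses import dataclass, field


@dataclass
class RadioMapEntry:
    """Hypothetical container for one raw-data radio map element M_i."""
    b1: tuple[float, float]            # vertex b_i1 (smaller coordinates)
    b2: tuple[float, float]            # opposite vertex b_i2 of the cell
    samples: dict[int, list[float]]    # AP index j -> list a_ij of RSSI values
    theta: dict = field(default_factory=dict)  # optional extra parameters

    def aps_in_range(self) -> set[int]:
        """Set N_i of APs heard at this calibration point."""
        return set(self.samples)

    def center(self) -> tuple[float, float]:
        """Calibration point p_i: midpoint of the stored diagonal."""
        return ((self.b1[0] + self.b2[0]) / 2, (self.b1[1] + self.b2[1]) / 2)

    def area(self) -> float:
        """Area |B_i| of the rectangular cell."""
        return abs(self.b1[0] - self.b2[0]) * abs(self.b1[1] - self.b2[1])

    def contains(self, x) -> bool:
        """True if the 2-dimensional location x lies in the cell B_i."""
        return (self.b1[0] <= x[0] <= self.b2[0]
                and self.b1[1] <= x[1] <= self.b2[1])


# Entry i = 5 of Figure 2.2 (AP 1: only the samples shown in the figure).
m5 = RadioMapEntry(
    b1=(1.5, -18.3),
    b2=(12.2, -11.2),
    samples={
        1: [-61, -60, -60, -57, -65, -62],
        4: [-91, -90, -92, -89, -90],
        8: [-78, -77, -78, -80, -80, -81],
    },
)

print(m5.aps_in_range())   # the set N_5
print(m5.center())         # approximately (6.85, -14.75)
print(m5.area())           # approximately 75.97
```

Storing only two opposite vertices keeps the entry small, while the center, the area and the membership test can all be derived from them when needed.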

2.2.2 Mean

The most common preprocessing choice is to store only the mean of $a_{ij}$ in the radio map. This approach is used in the RADAR method [Bahl and Padmanabhan 2000], which is discussed further in Section 2.4.4. It is very convenient, because only $|N_i|$ numbers are needed to represent the fingerprint $R_i$: one RSSI value for each AP in range. Thus the $i$th element in the radio map has the form

\[
M^{M}_i = \bigl(b_{i1},\, b_{i2},\, \underbrace{\{\bar{a}_{ij} \mid j \in N_i\},\, \theta_i}_{R_i \in R}\bigr), \tag{2.2}
\]

where the mean of $a_{ij}$ is

\[
\bar{a}_{ij} = \frac{1}{|a_{ij}|} \sum_{t=1}^{|a_{ij}|} a^t_{ij},
\]

where $|a_{ij}| = \mathrm{length}(a_{ij})$ and $a^t_{ij}$ is the $t$th element of the list $a_{ij}$. The radio map of the form (2.2) is used in Section 2.4.
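As a minimal sketch of this preprocessing step, the following Python function reduces each RSSI list to its sample mean. The function name `mean_fingerprint` and the dictionary layout (AP index mapped to a list of samples) are assumptions made for this illustration. The input uses the two completely listed sample lists of Figure 2.2.

```python
from statistics import fmean


def mean_fingerprint(samples):
    """Map each AP index j to the sample mean of its RSSI list a_ij."""
    return {j: fmean(a_ij) for j, a_ij in samples.items()}


# Lists a_54 and a_58 from Figure 2.2.
a5 = {4: [-91, -90, -92, -89, -90], 8: [-78, -77, -78, -80, -80, -81]}
means = mean_fingerprint(a5)
print(means)  # {4: -90.4, 8: -79.0}
```

The eleven raw samples are reduced to two numbers, one per AP in range, which is the memory saving described above.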

2.2.3 Mean and variance

Storing just the means of the fingerprint does not give any information about the variation of the data. Fingerprints can be extended to store also the variance of the RSSI samples [Kaemarungsi and Krishnamurthy 2004]. Thus the $i$th element in the radio map has the form

\[
M^{MV}_i = \bigl(b_{i1},\, b_{i2},\, \underbrace{\{\bar{a}_{ij},\, \sigma^2_{ij} \mid j \in N_i\},\, \theta_i}_{R_i \in R}\bigr), \tag{2.3}
\]

where the variance of $a_{ij}$ is

\[
\sigma^2_{ij} = \frac{1}{|a_{ij}| - 1} \sum_{t=1}^{|a_{ij}|} \bigl(a^t_{ij} - \bar{a}_{ij}\bigr)^2,
\]

and the sample size is assumed to be $|a_{ij}| > 1$. The radio map of the form (2.3) is used in Section 2.5, when the raw data is approximated with the Gaussian distribution.
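The mean and variance form can be sketched in Python as follows. The function name `mean_variance_fingerprint` and the dictionary layout are hypothetical. The standard-library function `statistics.variance` computes the sample variance with the divisor $|a_{ij}| - 1$, which matches the formula above, and lists with fewer than two samples are rejected in line with the assumption $|a_{ij}| > 1$.

```python
from statistics import fmean, variance


def mean_variance_fingerprint(samples):
    """Map each AP index j to (mean, sample variance) of its RSSI list a_ij."""
    out = {}
    for j, a_ij in samples.items():
        if len(a_ij) < 2:
            # The sample variance needs at least two samples.
            raise ValueError(f"AP {j}: at least two samples are required")
        out[j] = (fmean(a_ij), variance(a_ij))  # variance divides by n - 1
    return out


# Lists a_54 and a_58 from Figure 2.2.
a5 = {4: [-91, -90, -92, -89, -90], 8: [-78, -77, -78, -80, -80, -81]}
stats = mean_variance_fingerprint(a5)
print(stats[4])  # mean -90.4, variance approximately 1.3
print(stats[8])  # mean -79.0, variance approximately 2.4
```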

2.2.4 Histogram

The lists aij of RSSI values can be presented as RSSI histograms [Roos et al. 2002].

Thus the ith element in the radio map has the form

$$M_i^{H} = \big(b_{i1}, b_{i2}, \underbrace{\{H_{a_{ij}} \mid j \in N_i\}, \theta_i}_{R_i \in R}\big), \tag{2.4}$$

where the histograms are defined as the set of pairs,

$$H_{a_{ij}} = \{(b_t, h_t) \mid t \in J_i^j\}, \tag{2.5}$$

where $b_t$ is the histogram bin and $h_t$ is the corresponding bin height. The bin width $b$ is normally the same in every histogram. The number of bins is

$$k_{ij} = \left\lceil \frac{\max(a_{ij}) - \min(a_{ij})}{b} \right\rceil,$$

where the maximum and the minimum are taken over the elements of the list $a_{ij}$.

Histograms $H_{a_{ij}}$ are used in Section 2.5 to compute the probability of the measurement at the cell $B_i$.
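As an illustration, a histogram of the form (2.5) can be built as follows. This is a minimal sketch: placing the largest sample in the last bin and using at least one bin are choices made here, and the function names are hypothetical.

```python
import math

def rssi_histogram(samples, bin_width):
    """Return a list of (bin_start, count) pairs covering the range of samples."""
    lo, hi = min(samples), max(samples)
    # Number of bins as in the text; at least one bin so constant lists still work.
    k = max(1, math.ceil((hi - lo) / bin_width))
    counts = [0] * k
    for s in samples:
        # The largest sample is placed in the last bin (closed right edge).
        t = min(int((s - lo) // bin_width), k - 1)
        counts[t] += 1
    return [(lo + t * bin_width, counts[t]) for t in range(k)]

def normalize(hist):
    """Scale the bin heights so that they sum to one."""
    total = sum(h for _, h in hist)
    return [(b, h / total) for b, h in hist]

a_54 = [-91, -90, -92, -89, -90]   # RSSI list of AP 4 from Figure 2.2
hist = rssi_histogram(a_54, bin_width=2)
print(hist)
print(normalize(hist))
```

With a bin width of 2 the five samples fall into two bins; a smaller width gives a finer but sparser histogram.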

2.3 Measurements of location estimation phase

The objective in the location estimation phase is to estimate the state x from the received measurement y, which includes the RSSI samples from several APs. Locating a moving MU involves discretizing time into time steps and using the measurements collected during each step. The most common choice is a 1 s time step.

The measurement can be formulated with the same structure as the radio map, so the measurement y has the form

$$y = \{y_j \mid j \in N_y\} \in R, \tag{2.6}$$


where yj is the list of RSSI values measured from access point APj and the set of

APs in range is Ny. The number of APs in range during the time step is |Ny| and the

number of samples measured from APj is the length of the list yj .

In this work two main approaches are examined: the deterministic (Section 2.4) and the probabilistic (Section 2.5) approach. In the deterministic approach the RSSI samples are not used individually. Instead, the sample averages of the different APs are collected into a vector, which is used to estimate the MU's location. The measurement can then be formulated as

$$y = [\bar y_{N_y^1}, \ldots, \bar y_{N_y^n}]^T,$$

where $n = |N_y|$, $N_y^t$ is the tth element of the list $N_y$, and the sample average corresponding to $AP_j$ is

$$\bar y_j = \frac{1}{|y_j|} \sum_{t=1}^{|y_j|} y_j^t,$$

where $|y_j| = \mathrm{length}(y_j)$ and $y_j^t$ is the tth element of $y_j$. In the probabilistic approach each RSSI sample $y_j^t$ is used as a measurement.

The histogram comparison method, discussed in Section 2.5.4, differs from the other

probabilistic methods. Although it exploits every sample ytj, it uses them at the same

time by collecting the samples from the APj into a histogram Hyij. The number of

histograms equals the number of APs in range during the time step.

2.4 Deterministic location estimation

In the location estimation phase the measurements are compared to the radio map

to compute the location estimate. In the deterministic location estimation [Bahl and

Padmanabhan 2000] the state x is not considered as a continuous random variable.

The objective is not to obtain the whole distribution of the state x, but to compute a

single location estimate at every time step. Deterministic location estimation is based

on the similarity of the measurement y and the fingerprints Ri.

In this section the mathematical formulation of the deterministic approach is presented. Section 2.4.1 describes the different distance measures, which are used in the K-nearest neighbor method (Section 2.4.2) and in the weighted K-nearest neighbor method (Section 2.4.3). The RADAR system was the first deterministic location fingerprinting system and it is covered in Section 2.4.4.


2.4.1 Distance measures

Different measures are used to find the best match between observations and the radio

map. The common choice for the comparison measure is to use the p-norm to assign

a non-negative value to the fingerprint Ri [Bahl and Padmanabhan 2000].

Definition 2.1 (p-norm) Let $p \ge 1$ be a real number. Then the p-norm of the n-dimensional vector x is

$$\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}.$$

In this work the modified p-norm is used to assign a non-negative value to the fingerprint $R_i$, that is

$$\|y - a_i\|_p = \left( \sum_{j=1}^{|N_i|} \frac{1}{w_{ij}} |\bar y_j - \bar a_{ij}|^p \right)^{1/p}, \tag{2.7}$$

which is the distance in the signal strength space between the measurement y and the fingerprint $R_i$, computed during the location estimation phase. In equation (2.7) $w_{ij}$ are weights; setting $w_{ij} = 1$, the usual choice, gives the standard p-norm. Varying the parameter p gives different norms. For example, p = 1 corresponds to the Manhattan distance and p = 2 to the Euclidean norm, which is the most commonly used distance measure in location fingerprinting. The weights $w_{ij}$ can be used to balance the variations in the fingerprints. Prasithsangaree et al. [2002] assign weights $w_{ij} = |a_{ij}|^{-1}$, the reciprocal of the sample size used to compute $\bar a_{ij}$, so that each term of (2.7) is scaled by the sample size. This is presented as a measure of the reliability of $R_i$. Prasithsangaree et al. [2002] also use the standard deviation $\sigma_{ij}$ as weights.
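The distance (2.7) is straightforward to compute once the measurement and the fingerprint are available as per-AP means. The sketch below is illustrative only: the dictionary layout, the restriction to APs present in both inputs, and the numeric values are assumptions, not taken from the thesis.

```python
def weighted_p_distance(y, a, p=2.0, w=None):
    """Distance (2.7) between measurement means y and fingerprint means a.

    y, a : dicts mapping AP index j -> mean RSSI. Only APs of the fingerprint
           that are also heard in the measurement are compared (an assumption
           of this sketch).
    w    : optional dict j -> weight w_ij; missing weights default to 1.
    """
    w = w or {}
    common = [j for j in a if j in y]
    if not common:
        return float("inf")   # nothing to compare
    total = sum(abs(y[j] - a[j]) ** p / w.get(j, 1.0) for j in common)
    return total ** (1.0 / p)

y = {1: -60.0, 4: -88.0, 8: -80.0}       # hypothetical measurement means
a_i = {1: -63.0, 4: -92.0, 8: -80.0}     # hypothetical fingerprint means
print(weighted_p_distance(y, a_i, p=1))  # Manhattan distance
print(weighted_p_distance(y, a_i, p=2))  # Euclidean distance
```

Passing `w` as a dictionary of variances turns the Euclidean case into a variance-scaled distance.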

Another common norm is the infinity norm, which is defined next.

Definition 2.2 (∞-norm) The ∞-norm of the n-dimensional vector x is

‖x‖∞ = max (|x1|, . . . , |xn|) .

The ∞-norm is also called the maximum norm, and in this work it is used to compute the largest difference between the measurement $\bar y_j$ and the RSSI value $\bar a_{ij}$, that is

$$\|y - a_i\|_\infty = \max\left( |\bar y_{N_i^1} - \bar a_{iN_i^1}|, \ldots, |\bar y_{N_i^{|N_i|}} - \bar a_{iN_i^{|N_i|}}| \right),$$

where one assumes $N_y = N_i$.

The Mahalanobis distance can also be used as a distance measure.

Definition 2.3 (Mahalanobis distance) The Mahalanobis distance of a vector $x = [x_1, x_2, \ldots, x_n]^T$ from a set of samples with the mean $\mu = [\mu_1, \mu_2, \ldots, \mu_n]^T$ and the non-singular covariance matrix $\Sigma$ is defined as

$$D_M(x) = \sqrt{(x - \mu)^T \Sigma^{-1} (x - \mu)}.$$

It is a distance measure that takes the correlations between random variables into account. Thus it can be used to measure the similarity of an unknown sample set to a known one. The Mahalanobis distance between the measurement and the fingerprint $R_i$ can be computed as

$$d_M^i(y - a_i) = \sqrt{(y - a_i)^T \Sigma_i^{-1} (y - a_i)},$$

where y is the vector containing 1 s averages from different APs. In this work, the Mahalanobis distance is used by applying the sample means and the sample variances of the fingerprints. Because samples from different APs are assumed to be mutually independent, the covariance matrix $\Sigma_i$ used in the computation of the distance $d_M^i$ is the diagonal matrix

$$\Sigma_i = \mathrm{diag}(\sigma_{i1}^2, \ldots, \sigma_{in}^2),$$

so that the corresponding Mahalanobis distance is a special case of (2.7).
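To make the last remark explicit, the quadratic form can be written out under the stated assumption of a diagonal covariance matrix with positive entries (and $N_y = N_i$, $n = |N_i|$):

$$d_M^i(y - a_i) = \sqrt{\sum_{j=1}^{n} \frac{(\bar y_j - \bar a_{ij})^2}{\sigma_{ij}^2}},$$

which is the modified p-norm (2.7) with $p = 2$ and weights $w_{ij} = \sigma_{ij}^2$.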

2.4.2 K-nearest neighbor

The K-nearest neighbor (KNN) method is one of the simplest ways to determine the location of the MU by using the radio map. The KNN algorithm is a location fingerprinting method that considers K CPs to calculate the approximate position of the user. The idea is to compare the fingerprints in the radio map to the observed measurements and to select the K calibration points with the "nearest" RSSI values.

In the KNN approach, the vector y is used as a measurement and compared to the radio map (2.2), which includes only the sample averages. Let

$$L_{1:K} = \{p_1, \ldots, p_K\}$$

be the list of calibration point coordinates corresponding to the list of K fingerprints

$$a_{1:K} = \{a_1, \ldots, a_K\},$$

which satisfy

$$d(y, a_i) \le d(y, a_j),$$

where $a_i \in a_{1:K}$, $a_j \notin a_{1:K}$ and the function $d(\cdot)$ is a chosen distance measure discussed in Section 2.4.1. The Euclidean norm is widely used, but the Manhattan norm is also common. The most common choice for the MU's location estimator $\hat x$ is the average of the coordinates of the K "nearest" fingerprints, that is

$$\hat x = \frac{1}{K} \sum_{i=1}^{K} p_i, \quad p_i \in L_{1:K}. \tag{2.8}$$

The estimator (2.8) is a very restricted approach to compute the location estimate,

because the number of possible estimates is always finite and is a function of the

number of CPs.
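The estimator (2.8) can be sketched as follows, assuming the Euclidean distance and a radio map in which every fingerprint lists the same APs in the same order. The radio map values are hypothetical and serve only to exercise the function.

```python
import math

def knn_estimate(y, radio_map, k):
    """Average the coordinates of the k fingerprints nearest to y, as in (2.8).

    y         : list of mean RSSI values, one per AP
    radio_map : list of (p_i, a_i), where p_i = (x, y) are the coordinates of
                CP_i and a_i is its list of mean RSSI values
    """
    def dist(a):
        return math.sqrt(sum((yj - aj) ** 2 for yj, aj in zip(y, a)))

    nearest = sorted(radio_map, key=lambda entry: dist(entry[1]))[:k]
    xs = [p[0] for p, _ in nearest]
    ys = [p[1] for p, _ in nearest]
    return (sum(xs) / len(nearest), sum(ys) / len(nearest))

# Hypothetical radio map with three APs and four calibration points.
radio_map = [
    ((0.0, 0.0), [-50.0, -70.0, -90.0]),
    ((4.0, 0.0), [-60.0, -65.0, -85.0]),
    ((0.0, 4.0), [-55.0, -80.0, -80.0]),
    ((4.0, 4.0), [-70.0, -75.0, -70.0]),
]
y = [-52.0, -70.0, -88.0]
print(knn_estimate(y, radio_map, k=1))  # nearest neighbor (NN)
print(knn_estimate(y, radio_map, k=2))
```

Because the output is always an average of K calibration points, the set of reachable estimates is finite, as noted above.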

Saha et al. [2003] preprocess the raw data radio map to obtain both the mean $\bar a_{ij}$ and the standard deviation $\sigma_{ij}$ of the RSSI samples, which leads to the modified radio map (2.3). The location estimation is done with K = 1, which gives the nearest neighbor (NN) method. The Euclidean norm is used as the distance measure, but the estimate is rejected if

$$|\bar y_j - \bar a_{ij}| > 2\sigma_{ij},$$

where $CP_i$ is the "nearest" calibration point.

2.4.3 Weighted K-nearest neighbor

There are many modifications of the KNN approach in the literature. One approach is to calculate the location of the MU as a weighted average of the fingerprint locations, that is

$$\hat x = \frac{1}{\sum_{j=1}^{K} w_j} \sum_{i=1}^{K} w_i p_i, \quad p_i \in L_{1:K}. \tag{2.9}$$

Li et al. [2006] use the inverse of the RSSI distance as a weight, that is

$$w_i = d(y, a_i)^{-1}, \tag{2.10}$$

where

$$a_i = [\bar a_{iN_i^1}, \ldots, \bar a_{iN_i^{|N_i|}}]^T,$$

and $N_i^t$ is the tth element of the list $N_i$. Li et al. [2006] assert that in general the KNN and the weighted K-nearest neighbor (WKNN) methods can achieve better accuracy than the NN method, particularly with parameter values K = 3 and K = 4. However, if the density of the radio map is high, the NN method can perform as well as the more complicated methods [Li et al. 2006].
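A minimal sketch of the weighted estimator (2.9) with the inverse-distance weights (2.10) is given below. The small constant `eps`, which guards against division by zero when the measurement coincides with a fingerprint, is an addition of this sketch, and the radio map values are hypothetical.

```python
import math

def wknn_estimate(y, radio_map, k, eps=1e-9):
    """Weighted average (2.9) of the k nearest CPs with weights 1 / distance.

    eps is added to each distance to avoid division by zero; it is not part
    of (2.10).
    """
    def dist(a):
        return math.sqrt(sum((yj - aj) ** 2 for yj, aj in zip(y, a)))

    nearest = sorted(radio_map, key=lambda entry: dist(entry[1]))[:k]
    weights = [1.0 / (dist(a) + eps) for _, a in nearest]
    total = sum(weights)
    x = sum(w * p[0] for w, (p, _) in zip(weights, nearest)) / total
    z = sum(w * p[1] for w, (p, _) in zip(weights, nearest)) / total
    return (x, z)

# Two hypothetical calibration points; distances to y are 2 and 8.
radio_map = [
    ((0.0, 0.0), [-50.0, -70.0]),
    ((4.0, 0.0), [-60.0, -70.0]),
]
y = [-52.0, -70.0]
print(wknn_estimate(y, radio_map, k=2))
```

Unlike the plain average, the weighted estimate moves continuously with the measurement, here towards the calibration point that is closer in signal space.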


2.4.4 RADAR localization system

The first extensive WLAN-based system for locating and tracking users inside buildings was RADAR [Bahl and Padmanabhan 2000], developed by Microsoft researchers. This system can be considered foundational research on location fingerprinting methods and it is discussed next. Bahl and Padmanabhan [2000] cover all the basic deterministic approaches to location fingerprinting with the RADAR system, so it serves as a good reference against which later modifications can be compared.

Location fingerprinting can be done in two directions. In the first case, the MU receives information from the APs and determines its own location. This is called self-localization and it provides more privacy to the user. The other approach is called remote-localization, where the MU transmits information to the stationary base stations (BSs), which process the received information.

In the RADAR system, multiple base stations provide overlapping coverage in the area of interest and they are used to process the information transmitted by the MU. The effect of the MU's orientation is considered a systematic source of error. Thus during the calibration phase the MU records fingerprints forming the radio map M augmented with the MU's orientation $\theta_i = d_i$. The mean, median and variance are calculated for every combination of location and orientation, which presumably reduces the effect of random fluctuations in the radio map. Only the stationary case is considered, and during the location estimation phase the measurements are compared to the radio map using the Euclidean distance. When K is varied in the KNN method, small values of K give better results, because for large K points far away from the true location corrupt the estimate. On the other hand, Bahl and Padmanabhan [2000] did not benefit dramatically even from small values of K. The K nearest neighbors in the signal space are not necessarily K distinct locations in the physical space, because the radio map contains fingerprints corresponding to the different orientations of the user. Bahl and Padmanabhan [2000] decrease the effect of the orientation by considering, for each AP, only the radio map measurements taken in the direction with the maximum RSSI value. The test data, i.e. the measurements collected during the location estimation phase, are also collected in that same direction. This method increases the accuracy of the KNN method, and varying K is also more effective in this approach. Thus the orientation of the MU is reported to have a significant effect on the RSSI at a given location; a variation of up to 5 dBm is presented.

Bahl and Padmanabhan [2000] also studied the impact of varying the number of CPs considered during the location estimation phase. As expected, the error decreased when the density of fingerprints increased, but the rate of decrease slows down as the radio map becomes denser, which was also reported by Li et al. [2006]. This is due to the measurement error of the RSSI values in the radio map, which generates inaccuracy in the location estimation phase. This result motivates the careful selection of the CPs, because only a small benefit is achieved by increasing the density beyond a threshold. Prasithsangaree et al. [2002] and Kontkanen et al. [2004] suggest building a radio map with varying density by refining the grid in the areas where the number of obstructions is smaller. However, it is very hard to estimate the density needed to ensure a desired accuracy, because in real-world situations the signal propagation environment may vary significantly.

2.5 Probabilistic location estimation

Deterministic methods, such as the KNN method, use the sample mean during the location estimation phase to calculate the distances in the signal space. This does not exploit the raw data collected in the calibration phase comprehensively, because the mean is a complete representation of the data only in the case of zero variance. The statistical or probabilistic approach [Roos et al. 2002] exploits the sample of measurements collected during the calibration phase more efficiently than the deterministic methods. In this section static location estimation is considered: the distribution of the state is computed from simultaneously taken measurements, which are independent of the previous and the future measurements or location estimates.

The idea of the probabilistic methods in location fingerprinting is to compute the conditional pdf of the state $\mathbf{x}$ given measurements $\mathbf{y} = y$. This can be done by using Bayes' rule (Corollary 2.5).

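Definition 2.4 (Conditional pdf) The conditional pdf of the state $\mathbf{x}$ given measurements $\mathbf{y} = y$, i.e. the pdf of the random variable $\mathbf{x} \mid \mathbf{y} = y$, is defined as

$$p_{\mathbf{x}|\mathbf{y}}(x|y) = \frac{p_{\mathbf{x},\mathbf{y}}(x, y)}{p_{\mathbf{y}}(y)},$$

where the denominator $p_{\mathbf{y}}(y) > 0$.

Corollary 2.5 (Bayes' rule) From Definition 2.4 it follows that

$$p_{\mathbf{x}|\mathbf{y}}(x|y) = \frac{p_{\mathbf{y}|\mathbf{x}}(y|x)\, p_{\mathbf{x}}(x)}{p_{\mathbf{y}}(y)} = \frac{p_{\mathbf{y}|\mathbf{x}}(y|x)\, p_{\mathbf{x}}(x)}{\int p_{\mathbf{y}|\mathbf{x}}(y|x)\, p_{\mathbf{x}}(x)\, dx}.$$

For simplicity, in this work the pdf of the random variable $\mathbf{x}$ is also denoted as

$$p_{\mathbf{x}}(x) \triangleq p(x).$$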


The function p(y|x) in Corollary 2.5 is called the likelihood function of the measurement $\mathbf{y} = y$ and the function p(x) is called the prior, which is assumed to be independent of the measurements. The product of the prior and the likelihood divided by the normalizing constant p(y) gives the conditional pdf p(x|y), which is called the posterior of the state x.

In location fingerprinting, the prior distribution p(x) is often assumed to be a uniform distribution [Roos et al. 2002], and this approach is taken in this section. In Chapter 3 the posterior distribution is computed as a time series, where the prior is computed using the state model and the posterior from the previous time step. In this section the emphasis is on the computation of the likelihood function. As discussed, in the probabilistic approach of this work the state x is modeled as a continuous random variable. The fingerprints are collected at the rectangular cells of the grid B; the calibration is done at the CPs, which are at the centers of the cells. It is assumed that the measurements collected at the center of a cell represent the distribution of the RSSI in the whole cell. This assumption makes the likelihood function constant within each of the cells $B_i$.

Several approaches to computing the likelihood function are considered. Roos et al. [2002] present two non-parametric approaches to determining the likelihood function, namely the histogram method and the kernel method. Obtaining the likelihood p(y|x) with these methods is discussed in Sections 2.5.1 and 2.5.2, respectively. The parametric approach is presented in Section 2.5.3. These three approaches rely on the measurement model and especially on the non-parametric and parametric approximations of the normalized fingerprint histograms.

In Section 2.5.4 the likelihood function is computed directly from the whole histograms. The fingerprint histograms $H_{a_{ij}}$ and the measurement histograms $H_y$ from the location estimation phase are compared, and the likelihood is computed from the similarity value between these histograms.

The Bayesian framework is used to compute the piecewise constant posterior distribution of the state, and different estimates can be computed from that distribution. The first natural choice is the maximum a posteriori (MAP) estimate of the state, that is

$$\hat x_{MAP} = \arg\max_x\, p(x|y). \tag{2.11}$$

Because the uniform prior is used in this section, the MAP estimate equals the maximum likelihood (ML) estimate, that is

$$\hat x_{MAP} = \hat x_{ML}.$$

Because the posterior distribution is a piecewise constant function, the MAP estimate is actually the cell $B_i$ that has the maximum likelihood in the grid. Thus, to produce a point estimate, the expected value of the location within the cell is taken as the MAP estimate, that is,

$$\hat x_{MAP} = \int_{B_i} \frac{x}{|B_i|}\, dx = p_i,$$

where the vector $p_i$ is the center of the cell $B_i$, i.e. the calibration point. The maximum can also be attained in k cells, in which case the point estimate is computed as

$$\hat x_{MAP} = \int_{B} \frac{x}{|B|}\, dx = \frac{\sum_{i=1}^{k} \int_{B_i} x\, dx}{\sum_{i=1}^{k} |B_i|} = \frac{\sum_{i=1}^{k} |B_i|\, p_i}{\sum_{i=1}^{k} |B_i|}, \quad \text{where } B = \bigcup_{i=1}^{k} B_i.$$

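The other common choice is to compute the expected value of the posterior distribution, that is

$$\begin{aligned}
\hat x_{MEAN} &= E(x|y) \\
&= \int x\, p(x|y)\, dx \\
&= \int x \sum_{i=1}^{N} \frac{\beta_i}{|B_i|} \chi_{B_i}(x)\, dx \\
&= \sum_{i=1}^{N} \frac{\beta_i}{|B_i|} \int_{B_i} x\, dx \\
&= \sum_{i=1}^{N} \beta_i p_i.
\end{aligned} \tag{2.12}$$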

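The estimate $\hat x_{MEAN}$ is also the minimum mean square error (MMSE) estimator, which is shown next. The MMSE estimator can be written as

$$\begin{aligned}
\hat x_{MMSE} &= \arg\min_{\hat x} \left[ E\big(\|\mathbf{x} - \hat x\|_2^2 \mid \mathbf{y} = y\big) \right] \\
&= \arg\min_{\hat x} \left[ E\big((\mathbf{x}^T \mathbf{x} - \mathbf{x}^T \hat x - \hat x^T \mathbf{x} + \hat x^T \hat x) \mid \mathbf{y} = y\big) \right] \\
&= \arg\min_{\hat x} \left[ -2\, E(\mathbf{x}^T \mid \mathbf{y} = y)\, \hat x + \hat x^T \hat x \right].
\end{aligned} \tag{2.13}$$

The last step holds because the term $E(\mathbf{x}^T \mathbf{x} \mid \mathbf{y} = y)$ does not depend on $\hat x$. The gradient is zero at the critical point of the quadratic function in (2.13), that is

$$\begin{aligned}
\nabla\big( -2\, E(\mathbf{x}^T \mid \mathbf{y} = y)\, \hat x + \hat x^T \hat x \big) &= 0 \\
-2\, E(\mathbf{x} \mid \mathbf{y} = y) + 2 \hat x &= 0 \\
\hat x &= E(\mathbf{x} \mid \mathbf{y} = y).
\end{aligned}$$

Because $\nabla\big( -2\, E(\mathbf{x} \mid \mathbf{y} = y) + 2 \hat x \big) = 2I$ is a positive definite matrix, the point $\hat x = E(\mathbf{x} \mid \mathbf{y} = y)$ is the global minimum of the function $E(\|\mathbf{x} - \hat x\|_2^2 \mid \mathbf{y} = y)$. Thus $\hat x_{N-W} = \hat x_{MMSE} = \hat x_{MEAN}$, where $\hat x_{N-W}$ is the Nadaraya-Watson estimator of Section 2.5.2.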
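The difference between the MAP and the posterior-mean estimates can be checked numerically on a small example. In the sketch below the cell probabilities `betas` and the cell centers are hypothetical; the within-cell spread of the piecewise constant posterior is left out of the error, because it adds only a constant that does not depend on the estimate.

```python
def posterior_mean(betas, centers):
    """Posterior mean (2.12): sum of beta_i * p_i over the cells."""
    return tuple(sum(b * p[d] for b, p in zip(betas, centers)) for d in range(2))

def expected_sq_error(x_hat, betas, centers):
    """Sum of beta_i * ||p_i - x_hat||^2, the estimate-dependent part of the
    expected squared error under the piecewise constant posterior."""
    return sum(b * ((p[0] - x_hat[0]) ** 2 + (p[1] - x_hat[1]) ** 2)
               for b, p in zip(betas, centers))

betas = [0.5, 0.3, 0.2]                          # hypothetical cell probabilities
centers = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]   # hypothetical cell centers p_i

x_mean = posterior_mean(betas, centers)
x_map = centers[betas.index(max(betas))]         # center of the most probable cell
print("MEAN", x_mean, expected_sq_error(x_mean, betas, centers))
print("MAP ", x_map, expected_sq_error(x_map, betas, centers))
```

In this example the posterior mean lies between the cell centers and gives a smaller expected squared error than the MAP estimate, in line with the derivation above.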

Most of the probabilistic methods discussed in this work are based on the measurement

model, which is formulated next. The fingerprints hold information about the signal

characteristics across the cells. The normalized histogram pattern measured at the

CPi from the APj is interpreted as the distribution of the RSSI sample from the APj

at the cell Bi. Thus the measurement model can be written as

y = h(x) + v(x), (2.14)

where y includes the received measurements during the time step of the location

estimation phase, h is a known vector valued measurement function and v is the

measurement noise. The function h is assumed to be defined by the fingerprints. To

clarify the measurement function, let us assume that the n-dimensional measurement

vector includes RSSI measurements from the n APs. Then the measurement model is

$$\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} \bar a_{i1} \\ \vdots \\ \bar a_{in} \end{bmatrix} + \begin{bmatrix} v_{i1} \\ \vdots \\ v_{in} \end{bmatrix}.$$

The measurement function h represents the "true value" of the measurement. In location fingerprinting the "true value" is based on the fingerprints collected during the calibration phase, because that is the only available information about the characterization of the RSSI over the grid B. Thus the "true value", i.e. the range of the measurement function h, is the set of sample averages $\bar a_{ij}$ measured at the CPs. The RSSI histograms measured during the calibration phase are incorporated into the distribution of the measurement noise v. To obtain zero-mean measurement noise, the pdf of the discrete random variable v at the cell $B_i$ can be formulated with the fingerprint histograms, that is

$$p_{v_{ij}}(x) = \begin{cases} H_{v_{ij}}(x_j), & x = x_j,\ j = 1, 2, \ldots \\ 0, & \text{otherwise,} \end{cases} \tag{2.15}$$

where $H_{v_{ij}}(x_j) = h_t^N$ is the bin height of the histogram $H_{a_{ij}}$ corresponding to the bin $b_t - \bar a_{ij} = x_j$. The superscript N denotes normalized bin heights. The histogram $H_{v_{ij}}$ is the fingerprint histogram $H_{a_{ij}}$ shifted by the mean $\bar a_{ij}$, so that it satisfies

$$E(v) = 0.$$

The histograms $H_{v_{ij}}$ are used in the next section to compute the likelihood at the cell $B_i$.

2.5.1 Histogram method

As discussed in Section 2.2.4, the raw data measured during the calibration phase can be presented as a collection of RSSI histograms $H_{a_{ij}}$ corresponding to different cells and APs. Several RSSI histograms of sample frequencies are obtained at every calibration point: the number of histograms per location equals the number of APs in range at the cell. As discussed in Section 2.5, the normalized histograms, denoted $H^N_{a_{ij}}$, are interpreted as the discrete distributions of the received RSSI samples.

The independence of the random variables is needed next to compute the likelihood

at the cells.

Definition 2.6 (Independence) Random variables $\mathbf{x}_1, \ldots, \mathbf{x}_n$ are independent, if

$$F_{\mathbf{x}_1, \ldots, \mathbf{x}_n}(x_1, \ldots, x_n) = \prod_{i=1}^{n} F_{\mathbf{x}_i}(x_i) \quad \forall\, (x_1, \ldots, x_n) \in \mathbb{R}^n,$$

where $F_{\mathbf{x}}(x)$ is the cumulative distribution function of the random variable $\mathbf{x}$.

Theorem 2.7 Random variables $\mathbf{x}_1, \ldots, \mathbf{x}_n$ are independent, if and only if

$$p_{\mathbf{x}_1, \ldots, \mathbf{x}_n}(x_1, \ldots, x_n) = \prod_{i=1}^{n} p_{\mathbf{x}_i}(x_i) \quad \forall\, (x_1, \ldots, x_n) \in \mathbb{R}^n.$$

Proof. See [Kaleva 2008].

Theorem 2.7 also holds for discrete random variables, so it also holds for the measurement noise.

The components of the random vector $v_i$ are assumed to be mutually independent, so according to Theorem 2.7 the measurement likelihood at the grid cell $B_i$ can be computed as

$$\begin{aligned}
p_{y|x}(y|x) &= p_{v_i}(y - h(x)) \\
&= \prod_{j} p_{v_{ij}}(y_j - h_j(x)) \\
&= \prod_{j} p_{v_{ij}}(y_j - \bar a_{ij}) \\
&\triangleq p(y|i).
\end{aligned} \tag{2.16}$$


The first equality in (2.16) follows from the transformation of the measurement equa-

tion (2.14) into the likelihood form. Thus the known distribution of the measurement

noise can be exploited in the computation of the likelihood.

The piecewise constant posterior distribution can now be formulated with the likelihood, that is

$$p(x|y) = \sum_{i=1}^{N} \frac{\beta_i}{|B_i|} \chi_{B_i}(x), \tag{2.17}$$

where

$$\chi_{B_i}(x) = \begin{cases} 0, & x \notin B_i \\ 1, & x \in B_i \end{cases}$$

and the coefficients are

$$\beta_i = \frac{|B_i|\, p(y|i)}{\sum_{j=1}^{N} |B_j|\, p(y|j)}.$$

Now the likelihoods $p_{v_{ij}}(y_j - h_j(x))$ can be evaluated by using the distribution of v, that is

$$p_{v_{ij}}(y_j - h_j(x)) = H_{v_{ij}}(y_j - \bar a_{ij}),$$

which is the bin height in the histogram $H_{v_{ij}}$ corresponding to the bin $y_j - \bar a_{ij}$.

The histogram approach can also involve adjusting the histogram parameters, such as the histogram bin width b, which changes the histogram pattern. This is examined in Section 4.4.1.
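The histogram likelihood (2.16) and the coefficients $\beta_i$ of (2.17) can be sketched as follows. The sketch assumes integer RSSI values with a bin width of 1 and a single sample per AP in the measurement; the function names and the sample data are hypothetical.

```python
from collections import Counter

def normalized_histogram(samples):
    """Relative frequency of each integer RSSI value (bin width 1 assumed)."""
    counts = Counter(samples)
    return {value: c / len(samples) for value, c in counts.items()}

def cell_likelihood(y, fingerprint):
    """p(y|i) of (2.16): product over APs of the normalized bin height at y_j.

    Looking up y_j in the histogram of a_ij is the same as looking up
    y_j - mean(a_ij) in the mean-shifted noise histogram.
    """
    p = 1.0
    for j, y_j in y.items():
        hist = normalized_histogram(fingerprint.get(j, []))
        p *= hist.get(y_j, 0.0)
    return p

def posterior_weights(y, radio_map, areas):
    """Coefficients beta_i of (2.17) under a uniform prior."""
    unnorm = [area * cell_likelihood(y, fp) for fp, area in zip(radio_map, areas)]
    total = sum(unnorm)
    if total == 0.0:
        raise ValueError("measurement has zero likelihood in every cell")
    return [u / total for u in unnorm]

# Two hypothetical cells of equal area, each with raw samples from AP 1.
radio_map = [{1: [-60, -60, -61, -62]}, {1: [-61, -61, -61, -65]}]
betas = posterior_weights({1: -61}, radio_map, areas=[1.0, 1.0])
print(betas)
```

A measurement that falls into an empty bin gives zero likelihood for that cell, which is the incomplete-data problem discussed next.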

Roos et al. [2002] emphasize the importance of the prior, compared to a non-Bayesian approach where only the likelihoods are considered. The calibration phase is often incomplete, so that bins with zero probability can occur in some parts of the histograms. Roos et al. [2002] prevent this by applying a uniform prior, which assigns a small fraction of the total probability mass, $|a_{ij}|^{-1}$, to all bins.

Another way to handle the problem of incomplete data is discussed in the next section, where a non-parametric approximation is applied to the histograms.

2.5.2 Kernel method

As discussed in Section 2.5, the normalized RSSI histograms are interpreted as the distribution of the measurement, and the measurement likelihood can be obtained from the normalized histograms. However, these RSSI histograms are discontinuous and sensitive to disturbances due to the varying MU orientation, walking people and so on. The calibration time is also not necessarily long enough to produce reliable fingerprints. Thus it is reasonable to smooth the discrete histograms into a continuous function and to fill in the incomplete calibration data.

Kernel density estimation makes it possible to interpolate the data over the entire signal strength space and to fill spurious gaps in the RSSI histograms. Kernel density estimation can be considered a more general density estimation technique than histogram approximation; a histogram can be thought of as a collection of point samples from a kernel density estimate, if the kernel is a hypercube with the edge width equal to the histogram bin width.

The kernel method is derived next [Duda et al. 2001]. Kernel density estimation, or Parzen window estimation, is a widely used data-interpolation technique that estimates the underlying pdf from a given realization of a random sample. The starting point is the probability P that a random variable x with the pdf p(x) falls in a region A, which is given as

$$P = \int_{A} p(x)\, dx.$$

Let $x_1, \ldots, x_n$ be drawn independently from the distribution p(x). The probability that k of these n samples fall in A is given by the binomial law, that is

$$P_k = \binom{n}{k} P^k (1 - P)^{n-k},$$

and the expected value of k is E(k) = nP. Thus the probability P is approximated as

$$P \approx \frac{k}{n}. \tag{2.18}$$

The accuracy of the estimate increases as the number of samples n increases. On the other hand, if the region A is assumed small enough, then

$$P \approx p(x) V, \tag{2.19}$$

where x is a point inside the region A and V is the volume of the region A. Equations (2.18) and (2.19) yield the space-averaged density estimate, that is

$$p(x) \approx \frac{k}{nV}.$$

The accuracy of the estimate increases if the volume V is decreased. However, if the volume V is decreased enough, it no longer contains any of the samples $x_i$. Let $A_n$ be a sequence of regions containing x, where n is the sample size used. Let $V_n$ be the volume of region $A_n$, $k_n$ the number of samples falling in region $A_n$, and $p_n(x)$ the nth estimate for p(x), that is

$$p_n(x) = \frac{k_n}{n V_n}. \tag{2.20}$$

The convergence $\lim_{n \to \infty} p_n(x) = p(x)$ holds if the following conditions are satisfied:

$$\lim_{n \to \infty} V_n = 0 \tag{2.21}$$

$$\lim_{n \to \infty} k_n = \infty \tag{2.22}$$

$$\lim_{n \to \infty} \frac{k_n}{n} = 0. \tag{2.23}$$

Two common approaches are presented for generating a sequence of volumes $V_n$ that satisfies the conditions (2.21)-(2.23) [Duda et al. 2001]. The first approach is to specify $k_n$ as a function of n, and the volume $V_n$ is enlarged until it contains $k_n$ data samples. This approach leads to the $k_n$ nearest neighbor classification method, which however differs from the KNN method discussed in Section 2.4.2. The other approach is to specify the volume $V_n$ as a function of n, such as $1/\sqrt{n}$, which satisfies the conditions discussed.

The presented idea of density estimation is now extended to the kernel density estimation approach [Duda et al. 2001]. Suppose that the region $A_n$ is a d-dimensional hypercube, whose volume is given by

$$V_n = h_n^d, \tag{2.24}$$

where $h_n$ is the length of an edge of the hypercube. Let the window function

$$\varphi(u) = \begin{cases} 1, & |u_j| \le \tfrac{1}{2},\ j = 1, \ldots, d \\ 0, & \text{otherwise} \end{cases}$$

define a unit hypercube centered at the origin. Thus the number of samples falling into the hypercube centered at x is

$$k_n = \sum_{i=1}^{n} \varphi\!\left( \frac{x - x_i}{h_n} \right). \tag{2.25}$$

Substituting (2.25) into (2.20) yields the density estimate

$$p_n(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{V_n}\, \varphi\!\left( \frac{x - x_i}{h_n} \right). \tag{2.26}$$

Kernel density estimation is obtained by allowing a more general class of window functions. Instead of simply counting the number of random samples that fall within a fixed volume surrounding x, the count for each random sample can be weighted by its kernelized distance from x. This is achieved by replacing the unit hypercube window function with a kernel function $K(\cdot)$, which satisfies the following conditions.

Definition 2.8 (Kernel function) A kernel function is a non-negative integrable function satisfying the conditions

$$K(-x) = K(x) \tag{2.27}$$

$$\int K(x)\, dx = 1. \tag{2.28}$$

Hence the kernel density estimate is

$$p_n(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{V_n}\, K\!\left( \frac{x - x_i}{h_n} \right). \tag{2.29}$$

Because of (2.24) and (2.28), the kernel density estimate (2.29) is non-negative and integrates to one, so it is a valid pdf. If K(x) is a kernel function, then the function $K_{h_n}(x) = h_n^{-d} K(h_n^{-1} x)$ is also a kernel. Thus $h_n$ can be used to scale the kernel to be appropriate for the data and to smooth the density estimate. In the limit,

$$\lim_{h_n \to 0} K_{h_n}(x - x_i) = \delta(x - x_i),$$

where $\delta(x - x_i)$ is the Dirac delta function centered at the sample $x_i$. Thus if $h_n$ approaches zero, the density estimate $p_n(x)$ approaches a sum of delta functions centered at the samples. In the 1-dimensional case common choices for kernel functions are the Gaussian kernel

$$K(x) = \frac{1}{\sqrt{2\pi}} \exp\!\left( -\frac{x^2}{2} \right) \tag{2.30}$$

and the exponential kernel

$$K(x) = \frac{1}{2} \exp(-|x|). \tag{2.31}$$

The kernel method in location fingerprinting is discussed next. Like the histogram method, the kernel method is non-parametric, because the idea is to estimate the underlying pdf from the sample pattern. As discussed, the idea is to place a probability mass on a "kernel" around each of the samples $a_{ij}^t$. Thus the RSSI histogram $H_{a_{ij}}$ is approximated as a mixture of kernel functions, which leads to a smooth, possibly multimodal distribution of the measurement. The likelihood is computed as in the histogram method by using equation (2.16), where the factor corresponding to $AP_j$ is

$$p(y_j | i) = \frac{1}{|a_{ij}|\, h} \sum_{t=1}^{|a_{ij}|} K\!\left( \frac{y_j - a_{ij}^t}{h} \right),$$

where $K(\cdot)$ denotes the kernel function, $a_{ij}^t$ is the tth element of the list $a_{ij}$ and h > 0 is a smoothing parameter, which determines the width of the kernel.
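The per-AP kernel likelihood can be sketched as below, using the Gaussian kernel (2.30) and the sample list of AP 8 from Figure 2.2. The bandwidth h = 1 and the function names are assumptions made for illustration.

```python
import math

def gaussian_kernel(u):
    """Gaussian kernel (2.30)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kernel_likelihood(y_j, samples, h=1.0, kernel=gaussian_kernel):
    """Kernel density estimate of the RSSI density at y_j from the list a_ij."""
    return sum(kernel((y_j - a) / h) for a in samples) / (len(samples) * h)

a_58 = [-78, -77, -78, -80, -80, -81]  # RSSI list of AP 8 from Figure 2.2
for y_j in (-79, -85):
    print(y_j, kernel_likelihood(y_j, a_58, h=1.0))
```

The value -79 never occurs in the sample list, so the histogram method would give it zero probability, whereas the kernel estimate is positive there and decays smoothly away from the samples.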

Kushki et al. [2005] propose a modification of the kernel method of Roos et al. [2002]. The method is called Nadaraya-Watson kernel regression, whose idea is to approximate the joint density p(x, y) using kernel density estimation and to derive the conditional probability p(x|y) from it. If one assumes $N_i = N_y = N\ \forall i$, then the joint distribution p(x, y) can be approximated as

$$p(x, y) = \frac{1}{M} \sum_{i=1}^{M} \frac{1}{h_y^{N} h_x^{2}}\, K\!\left( \frac{y - a_i}{h_y} \right) K\!\left( \frac{x - p_i}{h_x} \right), \tag{2.32}$$

where the exponent of $h_x$ is 2 because the location x is 2-dimensional. The parameters $h_x$ and $h_y$ are the smoothing parameters. The probability p(y) is also approximated by using kernel density estimation,

$$p(y) = \frac{1}{M h_y^{N}} \sum_{i=1}^{M} K\!\left( \frac{y - a_i}{h_y} \right). \tag{2.33}$$

Kushki et al. [2005] obtained the best results by using the exponential kernel function (2.31).

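A conditional probability p(x|y) is obtained by using (2.32), (2.33) and Bayes' rule (Corollary 2.5). Computing the expectation of the posterior leads to a sum over the set L and to the Nadaraya-Watson estimator of the location,

$$\begin{aligned}
\hat x_{N-W} &= E(x|y) \\
&= \int_{B} x\, p(x|y)\, dx \\
&= \int_{B} x\, \frac{p(x, y)}{p(y)}\, dx \\
&= \int_{B} x\, \frac{M^{-1} h_y^{-N} h_x^{-2} \sum_{i=1}^{M} K\!\left( \frac{y - a_i}{h_y} \right) K\!\left( \frac{x - p_i}{h_x} \right)}{M^{-1} h_y^{-N} \sum_{i=1}^{M} K\!\left( \frac{y - a_i}{h_y} \right)}\, dx \\
&= \frac{\sum_{i=1}^{M} K\!\left( \frac{y - a_i}{h_y} \right) \int_{B} x\, h_x^{-2} K\!\left( \frac{x - p_i}{h_x} \right) dx}{\sum_{i=1}^{M} K\!\left( \frac{y - a_i}{h_y} \right)} \\
&= \frac{\sum_{i=1}^{M} K\!\left( \frac{y - a_i}{h_y} \right) p_i}{\sum_{i=1}^{M} K\!\left( \frac{y - a_i}{h_y} \right)} \\
&= \sum_{i=1}^{M} p_i\, \omega_i,
\end{aligned}$$

where the weights $\omega_i$ are

$$\omega_i = \frac{K\!\left( \frac{y - a_i}{h_y} \right)}{\sum_{j=1}^{M} K\!\left( \frac{y - a_j}{h_y} \right)}.$$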


In Section 2.5 it is shown that the Nadaraya-Watson estimator xN−W is actually the

minimum mean square error (MMSE) estimator xMMSE of the state x.
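The Nadaraya-Watson estimator reduces to a kernel-weighted average of the calibration point coordinates, which the following sketch illustrates. The multivariate kernel is taken here as a product of 1-dimensional exponential kernels (2.31); this choice, the bandwidth and the radio map values are assumptions of the sketch, not details confirmed by the source.

```python
import math

def exponential_kernel(u):
    """Product of 1-D exponential kernels (2.31) over the components of u."""
    return math.prod(0.5 * math.exp(-abs(c)) for c in u)

def nadaraya_watson(y, radio_map, h_y=1.0, kernel=exponential_kernel):
    """Return the estimate sum_i omega_i * p_i and the weights omega_i."""
    k_vals = [kernel([(yj - aj) / h_y for yj, aj in zip(y, a)])
              for _, a in radio_map]
    total = sum(k_vals)  # assumed non-zero; very distant measurements may underflow
    omegas = [k / total for k in k_vals]
    x = sum(w * p[0] for w, (p, _) in zip(omegas, radio_map))
    z = sum(w * p[1] for w, (p, _) in zip(omegas, radio_map))
    return (x, z), omegas

# Two hypothetical calibration points with two APs each.
radio_map = [
    ((0.0, 0.0), [-50.0, -70.0]),
    ((4.0, 0.0), [-60.0, -70.0]),
]
estimate, omegas = nadaraya_watson([-52.0, -70.0], radio_map, h_y=4.0)
print(estimate, omegas)
```

Compared with WKNN, every calibration point contributes to the estimate, and the bandwidth $h_y$ controls how quickly the contribution of distant fingerprints decays.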

2.5.3 Parametric approximation of measurement noise

If the calibration time is long, the RSSI histograms can be massive. Thus it is tempting to fit some known distribution to the RSSI sample histogram, after which the likelihood is easy to compute using the pdf of the known distribution. This approach is called the parametric method. Unfortunately, the distribution of the RSSI varies as a function of location and time, because of the complex radio signal propagation environment. An RSSI histogram can be symmetric, asymmetric, unimodal or multimodal, and in particular it is rarely Gaussian. Thus it is challenging to obtain good results if the measurement error is assumed to have a known distribution at every location.

As discussed, the measurement likelihood p(y|x) can be computed if the pdf of the additive noise v is known. In the literature RSSI histograms have been approximated with many different distributions. The most commonly used are the Gaussian [Haeberlen et al. 2004] and the log-normal [Kaemarungsi 2006] distributions. The Gaussian pdf is a symmetric function, whereas the log-normal pdf is right-skewed. However, it can be applied to left-skewed RSSI histograms when the absolute values of the signal strengths are used. Left-skewness of some RSSI histograms has been reported, which is attributed to the observation that the variations of the weaker RSSI values are larger than those of the stronger RSSI values. However, this assumption can be problematic at the outer limits of the AP's range. The MU has a lower limit for the received RSSI, below which the RSSI values are too low for the MU to register. This limitation of the MU truncates the tail of the RSSI histograms from the left. Thus the RSSI histograms can be left-skewed, symmetric or right-skewed, depending on the distance and obstacles between the MU and the APs.

In this work the parametric approach uses the individual samples $y_{tj}$ as measurements, as the histogram and kernel methods do. In the parametric method the histograms $H^{N}_{a_{ij}}$ are approximated by a known function and the likelihood is computed with that function. As discussed, the Gaussian approximation with the mean $\mu$ and the variance $\sigma^2$ is one choice for a parametric approximation of the RSSI histogram. The parameters of the Gaussian pdf are computed from the fingerprints, that is,

$$\mu_{ij} = a_{ij}, \qquad \sigma_{ij}^2 = \hat{\sigma}_{ij}^2,$$

where the right-hand sides are the sample mean and the sample variance of the fingerprint.

The 1 s averages can also be used as measurements with the Gaussian approximation of the RSSI histograms. This is based on Theorem 2.10.

Definition 2.9 (Convergence in distribution) The sequence of random variables $x_1, x_2, \ldots$ converges in distribution towards the random variable $x$ if

$$\lim_{n\to\infty} F_{x_n}(x) = F_x(x)$$

for every point of continuity of the cumulative distribution function $F_x$. This is denoted as $x_n \xrightarrow{D} x$.

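Theorem 2.10 (Central limit theorem) Let $x_1, x_2, \ldots, x_n$ be i.i.d. random variables with $E(x_i) = \mu$ and $V(x_i) = \sigma^2$. Then

$$\sqrt{n}\left(\frac{1}{n}\sum_{i=1}^{n} x_i - \mu\right) \xrightarrow{D} x \sim N(0, \sigma^2),$$

that is, for large $n$ the sample average is approximately distributed as $N\left(\mu, \frac{\sigma^2}{n}\right)$.

Proof. See [Kaleva 2008]. □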

Theorem 2.10 states that the distribution of the sample average approaches the Gaus-

sian distribution, when the sample size is increased. Thus the normalized RSSI his-

togram is approximated as the distribution of the sample average.
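The Gaussian parametric likelihood described above can be illustrated with a minimal Python sketch. It assumes that the APs are independent, so that the likelihood of a cell is the product of the per-AP Gaussian densities; the radio map, the AP names and the RSSI values are hypothetical and serve only as an illustration.

```python
import math

def gaussian_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def cell_likelihood(y, means, variances):
    """Likelihood p(y | x in B_i) under the Gaussian approximation.

    y, means and variances are dicts keyed by AP identifier. APs are treated
    as independent, and only APs present in both the measurement and the
    fingerprint contribute (both are assumptions of this sketch).
    """
    likelihood = 1.0
    for ap, rssi in y.items():
        if ap in means:
            likelihood *= gaussian_pdf(rssi, means[ap], variances[ap])
    return likelihood

# Hypothetical radio map: cell -> (sample means, sample variances), RSSI in dBm.
radio_map = {
    "B1": ({"ap1": -50.0, "ap2": -70.0}, {"ap1": 4.0, "ap2": 9.0}),
    "B2": ({"ap1": -65.0, "ap2": -55.0}, {"ap1": 6.0, "ap2": 5.0}),
}
measurement = {"ap1": -52.0, "ap2": -68.0}

likelihoods = {cell: cell_likelihood(measurement, m, v)
               for cell, (m, v) in radio_map.items()}
best_cell = max(likelihoods, key=likelihoods.get)
```

Only two numbers per cell and AP have to be stored in this approach, which is the practical motivation for the parametric method when the histograms are large.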

Other parametric approximations can be made to the histograms $H_{a_{ij}}$. The inverse function approximation

$$f(x) = |x|^{-1}$$

or the exponential approximation

$$f(x) = \frac{1}{2} e^{-|x|}$$

can be made. However, the inverse function has to be modified to cover the case $y_{tj} = a_{ij}$, because otherwise the likelihood value is infinite in that case. For example, the following modification can be made,

$$p_{v_{ij}}(x) = \begin{cases} 0, & |x| > t \\ C(2 - |x|), & |x| \le 1 \\ \dfrac{C}{|x|}, & \text{otherwise,} \end{cases}$$

where the parameter $t > 1$ is chosen large enough to ensure that the pdf gets non-zero values. The constant $C$ is chosen to make the function integrate to unity,

$$1 = \int_{-\infty}^{\infty} p_{v_{ij}}(x)\,dx = 2C\int_{1}^{t} x^{-1}\,dx + 3C \quad\Rightarrow\quad C = (2\ln(t) + 3)^{-1}.$$
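The normalization of the modified inverse function can be checked numerically. The following Python sketch implements the piecewise pdf as reconstructed from the text and integrates it with a simple midpoint rule; the value t = 20 and the helper name `integrate` are arbitrary choices made for illustration.

```python
import math

def normalizing_constant(t):
    """C = (2 ln(t) + 3)^(-1), valid for t > 1."""
    return 1.0 / (2.0 * math.log(t) + 3.0)

def noise_pdf(x, t):
    """Modified inverse-function pdf of the measurement noise."""
    c = normalizing_constant(t)
    ax = abs(x)
    if ax > t:
        return 0.0
    if ax <= 1.0:
        return c * (2.0 - ax)   # linear cap avoids the singularity at x = 0
    return c / ax               # inverse-function tails for 1 < |x| <= t

def integrate(f, lo, hi, steps=100000):
    """Midpoint rule, accurate enough for a sanity check."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

t = 20.0  # hypothetical cut-off
total = integrate(lambda x: noise_pdf(x, t), -t, t)
```

The two branches meet at |x| = 1, where both take the value C, so the modified pdf is continuous inside its support.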


As discussed, the log-normal distribution has been proposed as the approximate distribution of the absolute RSSI values. Log-normal is the general term for the distribution of any random variable whose logarithm has a Gaussian distribution. Its pdf is single-tailed, which is the reason why it is used to approximate RSSI histograms. If the random variable $u' = \log(u) \sim N(\mu', \sigma'^2)$ is Gaussian distributed, then $u \sim \text{log-}N(\mu', \sigma'^2)$ has a log-normal distribution. The parameters $\mu'$ and $\sigma'^2$ are the mean and the variance of the logarithm of the random variable, respectively. The base of the logarithm does not matter, but the natural logarithm is the most commonly used. The pdf of a log-normal random variable can be derived from the pdf of a Gaussian random variable. Let

$$x = \ln(u), \tag{2.34}$$

where the random variable $x$ has a Gaussian pdf. The incremental probabilities should be equal for both random variables $x$ and $u$ [Pohjavirta and Ruohonen 2005], that is

$$p(x)\,dx = p(u)\,du. \tag{2.35}$$

Differentiating equation (2.34) yields

$$dx = \frac{1}{u}\,du. \tag{2.36}$$

Substituting (2.36) into (2.35) yields the pdf of the log-normal distribution, that is

$$p_u(u) = \frac{p_x(\ln(u))}{u}.$$
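The change of variables above translates directly into code. The following Python sketch evaluates the log-normal pdf as the Gaussian pdf of ln(u) divided by u and checks numerically that it integrates to one; the parameter values are hypothetical and chosen to resemble absolute RSSI values around 60.

```python
import math

def gaussian_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2) evaluated at x."""
    return math.exp(-0.5 * (x - mu) ** 2 / sigma2) / math.sqrt(2.0 * math.pi * sigma2)

def lognormal_pdf(u, mu, sigma2):
    """p_u(u) = p_x(ln u) / u for u > 0, and 0 otherwise."""
    if u <= 0.0:
        return 0.0
    return gaussian_pdf(math.log(u), mu, sigma2) / u

# Hypothetical parameters of the logarithm of the absolute RSSI value.
mu, sigma2 = math.log(60.0), 0.01

# Midpoint-rule check that the density integrates to one.
steps, lo, hi = 20000, 1.0, 200.0
h = (hi - lo) / steps
total = h * sum(lognormal_pdf(lo + (i + 0.5) * h, mu, sigma2) for i in range(steps))
```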

2.5.4 Histogram comparison method

The full raw data (2.6) received at every time interval can be presented as histograms

as discussed in Section 2.3. In the histogram comparison approach the objective is to decide whether the measurement histogram $H^{N}_{y_{ij}}$ and the fingerprint histogram $H^{N}_{a_{ij}}$ come from the same distribution, and to compute the likelihood p(y|x) from that comparison. Intuitively the likelihood should be large if the measurement histograms come from the same distribution as the fingerprint histograms, and small otherwise.

In this work, two approaches are presented. The fingerprint histograms $H^{N}_{a_{ij}}$ are interpreted as the distributions of the measurement at the corresponding cell (Section 2.5). Similarly, the measurement histograms $H^{N}_{y_{ij}}$ are interpreted as distributions. Thus it is possible to compare the measurement histograms with the fingerprint histograms by using probability density distance measures [Sirola 2007]. Common probability density distances between the histograms $H^{N}_{a_{ij}}$ and $H^{N}_{y_{ij}}$ are listed in Table 2.1, where the functions $f$ and $g$ denote the histograms $H^{N}_{y_{ij}}$ and $H^{N}_{a_{ij}}$, respectively.

Table 2.1: Probability density distances

Name                      d(f, g)

Infinity norm             $\sup_x |f(x) - g(x)|$

Lissack-Fu (LF)           $\sum_x |f(x) - g(x)|^p$

Bhattacharyya             $\sum_x \sqrt{f(x)\,g(x)}$

Kullback-Leibler (K-L)    $\sum_x f(x) \log \frac{f(x)}{g(x)}$

Simandl                   $1 - \sum_x \min(f(x), g(x))$
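The measures of Table 2.1 are straightforward to compute for normalized histograms. The Python sketch below represents a histogram as a dictionary from RSSI bin to probability, which is a representation assumed here for illustration. The small floor `eps` in the Kullback-Leibler sum is also an assumption, added to avoid division by zero in empty bins. Note that the Bhattacharyya sum, as written in the table, equals one for identical histograms, whereas the other measures equal zero.

```python
import math

def _bins(f, g):
    """Union of the bins of two histograms given as {bin: probability}."""
    return sorted(set(f) | set(g))

def infinity_norm(f, g):
    return max(abs(f.get(x, 0.0) - g.get(x, 0.0)) for x in _bins(f, g))

def lissack_fu(f, g, p=1.0):
    return sum(abs(f.get(x, 0.0) - g.get(x, 0.0)) ** p for x in _bins(f, g))

def bhattacharyya(f, g):
    return sum(math.sqrt(f.get(x, 0.0) * g.get(x, 0.0)) for x in _bins(f, g))

def kullback_leibler(f, g, eps=1e-12):
    # Bins with f(x) = 0 contribute nothing; g is floored at eps (assumption).
    return sum(fx * math.log(fx / max(g.get(x, 0.0), eps))
               for x, fx in f.items() if fx > 0.0)

def simandl(f, g):
    return 1.0 - sum(min(f.get(x, 0.0), g.get(x, 0.0)) for x in _bins(f, g))

# Hypothetical normalized histograms over two RSSI bins (dBm).
f = {-60: 0.5, -61: 0.5}
g = {-60: 0.25, -61: 0.75}
distances = {
    "inf": infinity_norm(f, g),
    "lf": lissack_fu(f, g),
    "bhattacharyya": bhattacharyya(f, g),
    "kl": kullback_leibler(f, g),
    "simandl": simandl(f, g),
}
```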

2.6 Deterministic location estimation methods in the probabilistic framework

In this section the deterministic methods are explored from the probabilistic point of

view and the relations between the methods are discussed.

There is a link between the nearest neighbor method (Section 2.4.2) and the kernel

method. In the nearest neighbor approach only the sample means ai of the fingerprints

are considered. In that case the kernel method assigns only one kernel function around that sample mean. Let us first consider the mean estimator of the posterior. If the kernel width $h$ approaches zero, the likelihood related to $B_i$ and AP$_j$ approaches the function $\delta(y - a_{ij})$. Then the difference between the location corresponding to the maximum of the likelihood and the other locations increases. Hence the nearest neighbor method is actually a special case of the kernel method, and in the case of the Euclidean distance measure the corresponding estimate of the MU’s location is

$$\hat{x}_{\mathrm{MEAN}} = p_j, \qquad j = \arg\min_i \|y - a_i\|_2. \tag{2.37}$$

If the MAP estimator is used, the kernel function does not need to approach the delta function for the relation between these methods to hold. In that case the location estimate $\hat{x}_{\mathrm{MAP}}$ has the form (2.37). The kernel method also has a direct connection with the parametric approximation of the measurement noise: the function used in the parametric approximation equals the single kernel function assigned around the sample mean.


By using the parametric approach to the histogram density estimation, the link between the weighted K-nearest neighbor method (Section 2.4.3) and the probabilistic method is shown. In the WKNN method the location estimate $\hat{x}$ is computed as the weighted average of the coordinates of the K nearest neighbors, and the vector of sample averages $y$ is used as the measurement. If the RSSI histograms $H^{N}_{a_{ij}}$ are approximated to present the distributions of the sample averages $y$ and the areas of the cells are assumed to be identical, then the weight function of the WKNN method corresponds in the probabilistic approach to the approximation of the pdf of the measurement noise $v_i$. Moreover, the WKNN estimator equals the truncated conditional expectation estimator, or truncated mean estimator, of the probabilistic approach. The truncated mean is the mean computed after discarding some parts of a probability density function. By using equation (2.12), the truncated mean can be formulated as

$$E_{\mathrm{TR}}(x|y) = \sum_{i=1}^{K} \beta_i^{\mathrm{TR}} p_i,$$

where $\beta_i^{\mathrm{TR}}$ are the K largest $\beta_i$, normalized. As discussed, the weight $w_i$ in equation (2.10) can be, for example, the inverse of the signal distance. In that case,

$$w_i = \beta_i^{\mathrm{TR}} = p_{v_i}(y - a_i) = [d(y - a_i)]^{-1},$$

which states that $H_{v_i} \approx [d(y - a_i)]^{-1}$. However, the pdf of the measurement noise has to be modified to handle the argument value zero, which occurs when the signal distance $d(y - a_i) = 0$. Moreover, the function has to be multiplied by a constant to make it integrate to unity. Thus the weighted average corresponds to the truncated mean of the posterior distribution $p(x|y)$ of the state $x$ obtained in the probabilistic approach; in particular, the choices $K = M$ and $K = 1$ lead to the MMSE and MAP estimators of the probabilistic approach.
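The correspondence between the WKNN estimate and the truncated mean can be sketched in Python as follows. The weights are the inverse Euclidean signal distances, the K largest weights are kept and normalized, and the estimate is the weighted average of the corresponding calibration point coordinates. The small constant `eps`, which guards against a zero signal distance, and the three-cell radio map are assumptions of this sketch.

```python
import math

def signal_distance(y, a):
    """Euclidean distance between the measurement y and the fingerprint mean a."""
    return math.sqrt(sum((yi - ai) ** 2 for yi, ai in zip(y, a)))

def wknn_estimate(y, fingerprints, k, eps=1e-9):
    """Truncated mean: weighted average over the K largest weights.

    fingerprints is a list of (p_i, a_i) pairs, where p_i = (x, y) is the
    calibration point and a_i the vector of sample means.
    """
    weighted = [(1.0 / (signal_distance(y, a) + eps), p) for p, a in fingerprints]
    weighted.sort(key=lambda item: item[0], reverse=True)
    top = weighted[:k]
    total = sum(w for w, _ in top)  # normalization of the K largest weights
    return tuple(sum(w * p[d] for w, p in top) / total for d in range(2))

# Hypothetical radio map: three calibration points along a corridor.
fingerprints = [
    ((0.0, 0.0), [-50.0, -70.0]),
    ((10.0, 0.0), [-60.0, -60.0]),
    ((20.0, 0.0), [-70.0, -50.0]),
]
y = [-51.0, -69.0]

nearest = wknn_estimate(y, fingerprints, k=1)     # K = 1: nearest neighbor
weighted_3 = wknn_estimate(y, fingerprints, k=3)  # K = M: all cells used
```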

Chapter 3

Filtering Approach

In this chapter the locationing accuracy is enhanced by exploiting also the previous

measurements in addition to the current ones.

3.1 Filtering

In Chapter 2 static location estimation is considered, i.e. only the current measurements are used to compute the conditional probability $p(x|y)$ and only the location of the MU is included in the state $x$. In the filtering approach the conditional pdf $p(x|y)$ is computed as a time series and the desired location estimate is computed from the obtained distribution. All the earlier measurements, in addition to the current ones, are used to obtain the location estimate. This leads to the computation of the conditional pdf $p_{x_k|y_{1:k}}(x_k|y_{1:k})$ at every time step, which improves the position accuracy compared to static location estimation. The notation $y_{1:k}$ denotes the measurements up to the current time index $k$.

Definition 3.1 [Stochastic process] Let $(\Omega, \mathcal{F}, P)$ be a probability space and $T$ a parameter set. A stochastic process is a mapping $x : \Omega \times T \to \mathbb{R}^n$ such that for every fixed $t \in T$, $x(\cdot, t)$ is a random variable, which is usually denoted $x_t$ or $x(t)$.

Definition 3.2 [Markov process] A Markov process is a sequence of random variables $\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \ldots$ with the Markov property, that is

$$p(\mathbf{x}_{n+1} = x \mid \mathbf{x}_n = x_n, \ldots, \mathbf{x}_1 = x_1) = p(\mathbf{x}_{n+1} = x \mid \mathbf{x}_n = x_n).$$


Definition 3.3 [White noise] A stochastic process is said to be white noise if

$$p(x_{k+1} \mid x_k) = p(x_{k+1}).$$

The usage of the earlier measurements is based on the state model, which matches

the previous state with the current one. State models are discussed more in the next

section, but now the general formulation is presented. The state model can be used to

compute the location estimate, even if no new measurements are received. The general

form of the dynamic state model can be formulated as

$$x_k = f_{k-1}(x_{k-1}) + w_{k-1}, \tag{3.1}$$

where $x_k = x(t_k)$, $f_{k-1}(\cdot)$ is the state transfer function and $w_k$ is the state model error. The subscript $k$ refers to the time instant $t_k$. In this chapter the state $x_k$ is interpreted as a stochastic process.

The measurement model is (2.14), described in Section 2.1, augmented with the time index $k$, that is

$$y_k = h_k(x_k) + v_k(x_k). \tag{3.2}$$

The state noise $w_k$ and the measurement noise $v_k$ are assumed to be zero-mean white noise. Thus the errors are independent of the past errors and the state’s conditional

probability can be computed recursively without reusing the whole measurement his-

tory at each time step [Bar-Shalom and Li 2001]. To compute the conditional pdf of

the state, the initial state x0 is also needed in addition to the state and measurement

models. Once the conditional state distribution is obtained, a location estimate with

the error covariance matrix can be computed according to some chosen optimality

criterion, as discussed in Section 2.5.

3.2 State models

In this work three state models are used. The first state model exploits the floor plan of the building. The connection information of two fingerprints is stored in the parameter $\alpha$ in the radio map (2.1), that is

$$\alpha_{ij} = \begin{cases} 0, & \text{no connection between } B_i \text{ and } B_j \\ w(d_{ij}), & \text{connection between } B_i \text{ and } B_j, \end{cases}$$

where $w(d_{ij})$ is the weight function, which depends on the distance $d_{ij}$ between the locations $p_i$ and $p_j$. The most obvious choice is to assign $w(d_{ij}) = 1$. The weight can also be evaluated from the Gaussian distribution, that is

$$w(d_{ij}) = p_u(d_{ij}),$$

where $u \sim N_2(p_i, \sigma^2 I)$. Thus the matrix $[G]_{ij} = \alpha_{ij}$ is obtained, and it forms a graph of the connections between the cells.

At each time instant the MU may move from the current cell to another cell or remain in the same cell. The changes of cell are called transitions, and the probabilities associated with the various cell changes are termed transition probabilities. The matrix $T$ is a transition probability matrix whose element $\pi_{ij} = [T]_{ij}$ is the probability of the transition from the cell $B_j$ to the cell $B_i$. The matrix $T$ is obtained by dividing each element of the matrix $G$ by the corresponding column sum, that is

$$[T]_{ij} = \frac{\alpha_{ij}}{\mathbf{1}^T g_j},$$

where $\mathbf{1}$ is the column vector with all entries equal to 1 and $g_j$ is the jth column of the matrix $G$. The matrix $T$ describes a Markov process, which means that given the present cell, future cells are independent of the past cells.

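The state model described can be used in the Bayesian filter (Section 3.3). The prior pdf of the state in (3.7) can be formulated as

$$p_{x_k|y_{1:k-1}}(x_k|y_{1:k-1}) = \sum_i \frac{\beta_k^{i-}}{|B_i|}\chi_{B_i}(x_k).$$

Let the transition probabilities be denoted as

$$\pi_{ij} \triangleq P(x_k \in B_i \mid x_{k-1} \in B_j),$$

which means the probability $P(x_k \in B_i)$ when $p_{x_{k-1}}(x_{k-1}) = \frac{\chi_{B_j}(x_{k-1})}{|B_j|}$. The transition probabilities can be formulated as

$$\begin{aligned}
\pi_{ij} &= \int_{B_i}\int p_{x_k,x_{k-1}}(x_k, x_{k-1})\,dx_{k-1}\,dx_k \\
&= \int_{B_i}\int p_{x_k|x_{k-1}}(x_k|x_{k-1})\,p_{x_{k-1}}(x_{k-1})\,dx_{k-1}\,dx_k \\
&= \int_{B_i}\int_{B_j} \frac{p_{x_k|x_{k-1}}(x_k|x_{k-1})}{|B_j|}\,dx_{k-1}\,dx_k.
\end{aligned}$$

Hence

$$\begin{aligned}
\beta_k^{i-} &= \iint p_{x_k|x_{k-1}}(x_k|x_{k-1})\,\chi_{B_i}(x_k)\,p_{x_{k-1}|y_{1:k-1}}(x_{k-1}|y_{1:k-1})\,dx_{k-1}\,dx_k \\
&= \iint \sum_j p_{x_k|x_{k-1}}(x_k|x_{k-1})\,\chi_{B_i}(x_k)\,\chi_{B_j}(x_{k-1})\,\frac{\beta_{k-1}^{j}}{|B_j|}\,dx_{k-1}\,dx_k \\
&= \sum_j \pi_{ij}\beta_{k-1}^{j}.
\end{aligned}$$

Thus the evolution of the weights $\beta^i$ can be computed using the matrix equation

$$\beta_k^- = T\beta_{k-1},$$

where $\beta_k^-, \beta_{k-1} \in \mathbb{R}^M$.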
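The construction of the transition matrix and the prediction of the cell weights can be sketched in Python as follows. The connection matrix of a three-cell corridor is hypothetical; the weight $w(d_{ij}) = 1$ is used and every cell is assumed to be connected to itself, so that the MU can remain in the same cell. The sketch also assumes that no column of $G$ sums to zero.

```python
def transition_matrix(g):
    """Divide each element of G by its column sum, so that every column sums to one."""
    m = len(g)
    col_sums = [sum(g[i][j] for i in range(m)) for j in range(m)]
    return [[g[i][j] / col_sums[j] for j in range(m)] for i in range(m)]

def predict_weights(t, beta):
    """Prior weights beta_k^- = T beta_{k-1}."""
    m = len(beta)
    return [sum(t[i][j] * beta[j] for j in range(m)) for i in range(m)]

# Hypothetical corridor of three cells: B1 - B2 - B3, with self-connections.
G = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]
T = transition_matrix(G)

# The MU is known to be in cell B1 at time k-1.
beta_prior = predict_weights(T, [1.0, 0.0, 0.0])
```

With this choice the prior mass is split evenly between the cell B1 and its only neighbor B2, and the cell B3 cannot be reached in one step.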

The constant velocity model is the second state model used. It is used with the linear Kalman filter discussed in Section 3.4.2. In the constant velocity model of this work, the state $x$ contains the 2-dimensional velocity in addition to the 2-dimensional location. The constant velocity error is based on Brownian motion, denoted as $\beta$. The state $x$ is modeled with the stochastic differential equation

$$dx = Fx\,dt + G\,d\beta, \tag{3.3}$$

where

$$F = \begin{bmatrix} 0_{2\times2} & I_{2\times2} \\ 0_{2\times2} & 0_{2\times2} \end{bmatrix} \quad \text{and} \quad G = \begin{bmatrix} 0_{2\times2} \\ I_{2\times2} \end{bmatrix}.$$

Brownian motion also needs a diffusion matrix, which is assumed to be $Q_C = \sigma_C^2 I$, where $\sigma_C^2$ describes how much variance is added to the velocity error in a 1 s time interval in the horizontal and vertical directions. The solution of equation (3.3) is presented in the literature [Jazwinski 1970], and it leads to the linear state model

$$x_k = F_{k-1}x_{k-1} + w_{k-1},$$

where

$$F_{k-1} = \begin{bmatrix} I_{2\times2} & \Delta t\,I_{2\times2} \\ 0_{2\times2} & I_{2\times2} \end{bmatrix}$$

and $\Delta t = t_k - t_{k-1}$. The state noise is $w_k \sim N(0, Q_k)$, where the state noise covariance matrix is

$$Q_k = \begin{bmatrix} \frac{1}{3}\Delta t^3 I & \frac{1}{2}\Delta t^2 I \\ \frac{1}{2}\Delta t^2 I & \Delta t\,I \end{bmatrix}\sigma_C^2 = \begin{bmatrix} \frac{1}{3} I & \frac{1}{2} I \\ \frac{1}{2} I & I \end{bmatrix}\sigma_C^2, \tag{3.4}$$

where $\Delta t = 1$ s in this work.
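The matrices of the constant velocity model can be written out explicitly for the 4-dimensional state (two location components followed by two velocity components). The Python sketch below builds $F_{k-1}$ and $Q_k$ as nested lists and propagates a state mean by one step; the function names and the numerical values are chosen for illustration only.

```python
def cv_matrices(dt, sigma2_c):
    """State transition matrix F and noise covariance Q of the constant velocity model.

    State ordering (an assumption of this sketch): [x, y, vx, vy].
    """
    f = [
        [1.0, 0.0, dt, 0.0],
        [0.0, 1.0, 0.0, dt],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]
    q_pp = sigma2_c * dt ** 3 / 3.0   # location-location block
    q_pv = sigma2_c * dt ** 2 / 2.0   # location-velocity block
    q_vv = sigma2_c * dt              # velocity-velocity block
    q = [
        [q_pp, 0.0, q_pv, 0.0],
        [0.0, q_pp, 0.0, q_pv],
        [q_pv, 0.0, q_vv, 0.0],
        [0.0, q_pv, 0.0, q_vv],
    ]
    return f, q

def predict_mean(f, x):
    """Propagate the state mean: x_k = F x_{k-1}."""
    return [sum(f[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]

F, Q = cv_matrices(dt=1.0, sigma2_c=0.5)
x_next = predict_mean(F, [0.0, 0.0, 1.0, 2.0])  # moving 1 m/s east, 2 m/s north
```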

The stationary state model is the third state model used. Velocity measurements are not available, and the MU is mainly located indoors, where the velocity often stays small. The stationary state model can be formulated as discrete Brownian motion, that is

$$x_k = x_{k-1} + w_{k-1}, \tag{3.5}$$

hence the prior is obtained by increasing the uncertainty of the estimate from the previous time step, while the mean of the estimate does not change.

3.3 Bayesian filter

In this section the state and measurement models described in the previous section are used to derive the state’s conditional pdf given the measurements, that is

$$p_{x_k|y_{1:k}}(x_k|y_{1:k}) \triangleq p(x_k|y_{1:k}).$$

The initial state distribution $p(x_0)$ is needed before any measurements. After that, Bayesian filtering consists of two phases, the prediction phase and the update phase. Let the state’s conditional pdf $p(x_{k-1}|y_{1:k-1})$ at the time $k-1$ be available. Then the prediction of the state is done with the Chapman-Kolmogorov equation,

$$p(x_k|y_{1:k-1}) = \int p(x_k|x_{k-1})\,p(x_{k-1}|y_{1:k-1})\,dx_{k-1}. \tag{3.6}$$

The state evolution in equation (3.6) can be derived from the known statistics of $w_k$, that is

$$p(x_k|x_{k-1}) = p_{w_{k-1}}(x_k - f_{k-1}(x_{k-1})).$$


The update phase is considered next. The current measurement $y_k$ becomes available and is used to update the predicted state distribution $p(x_k|y_{1:k-1})$. The update phase is based on Bayes’ rule,

$$\begin{aligned}
p(x_k|y_{1:k}) &= p(x_k|y_k, y_{1:k-1}) \\
&= \frac{p(y_k|x_k, y_{1:k-1})\,p(x_k|y_{1:k-1})}{p(y_k|y_{1:k-1})} \\
&= \frac{p(y_k|x_k)\,p(x_k|y_{1:k-1})}{p(y_k|y_{1:k-1})}.
\end{aligned} \tag{3.7}$$

The conditional pdf $p(x_k|y_{1:k})$ is called the posterior distribution, the predicted state $p(x_k|y_{1:k-1})$ is called the prior distribution of the state and $p(y_k|x_k)$ is called the measurement likelihood. The probability of the measurement $p(y_k|y_{1:k-1})$ can be considered a normalizing constant and can be computed as

$$p(y_k|y_{1:k-1}) = \int p(y_k|x_k)\,p(x_k|y_{1:k-1})\,dx_k.$$

As already discussed in Chapter 2, in the static case the measurement likelihood can be obtained from the known statistics of the measurement noise. Thus the static estimator can be enhanced by using the state model to compute the prior distribution.
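On a finite set of cells the two phases of the Bayesian filter reduce to a matrix-vector product and an element-wise product followed by normalization. The following Python sketch illustrates one filtering step; the transition matrix, the likelihood values and the function names are hypothetical.

```python
def predict(transition, posterior):
    """Prediction phase on a cell grid: prior_i = sum_j T[i][j] * posterior_j."""
    n = len(posterior)
    return [sum(transition[i][j] * posterior[j] for j in range(n)) for i in range(n)]

def update(prior, likelihood):
    """Update phase (Bayes' rule); the normalizer plays the role of p(y_k | y_{1:k-1})."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(unnormalized)
    if evidence == 0.0:
        raise ValueError("measurement has zero probability under the prior")
    return [u / evidence for u in unnormalized]

# Hypothetical column-stochastic transition matrix of a three-cell corridor.
T = [
    [0.5, 1.0 / 3.0, 0.0],
    [0.5, 1.0 / 3.0, 0.5],
    [0.0, 1.0 / 3.0, 0.5],
]
posterior_prev = [1.0, 0.0, 0.0]   # MU known to be in the first cell
likelihood = [0.1, 0.3, 0.6]       # hypothetical values of p(y_k | x_k in B_i)

prior = predict(T, posterior_prev)
posterior = update(prior, likelihood)
```

Although the measurement alone favors the third cell, the prior assigns it zero probability, so the posterior mass stays in the first two cells. This is the way in which the state model enhances the static estimator.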

3.4 Kalman filter approach

In location fingerprinting the RSSI histograms measured at the CPs are interpreted as the measurement model. The measurement function $h(x)$ is a non-linear function of the state and the measurement noise $v_k$ has a non-Gaussian pdf, which makes it hard to compute the conditional pdf of the state. Thus the Bayesian approach is impractical, and in Section 3.4.1 the frequentist approach is used to derive the Best Linear Unbiased (BLU) estimator of the state. In Section 3.4.2 the measurement and state models are linear functions with Gaussian noise, which makes it possible to compute the conditional pdf of the state with the BLU estimator. In Section 3.4.3 the non-linear Kalman filter is applied to location fingerprinting.

3.4.1 Best linear unbiased estimator

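Let the expectation of the joint distribution of the state $x_k$ and the measurement $y_k$ be

$$E\left(\begin{bmatrix} x_k \\ y_k \end{bmatrix}\right) = \begin{bmatrix} \bar{x}_k \\ \bar{y}_k \end{bmatrix}$$

and the covariance matrix be

$$V\left(\begin{bmatrix} x_k \\ y_k \end{bmatrix}\right) = \begin{bmatrix} P_{xx_k} & P_{xy_k} \\ P_{yx_k} & P_{yy_k} \end{bmatrix}. \tag{3.8}$$

The objective is to find the linear estimator

$$\hat{x}_k = Cy_k + b$$

that minimizes the expected value of the squared error, i.e. the mean squared error (MSE), that is

$$\hat{x}_k = \arg\min_{\hat{x}_k}\left[E(e_k^T e_k)\right],$$

where the estimation error is $e_k = x_k - \hat{x}_k$. The other requirement for the BLU estimator is unbiasedness, that is

$$\begin{aligned}
E(e_k) &= 0 \\
\bar{x}_k - E(Cy_k + b) &= 0 \\
\bar{x}_k - C\bar{y}_k - b &= 0 \\
b &= \bar{x}_k - C\bar{y}_k,
\end{aligned}$$

and hence the estimator has the form

$$\hat{x}_k = Cy_k + \bar{x}_k - C\bar{y}_k = \bar{x}_k + C(y_k - \bar{y}_k). \tag{3.9}$$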

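Let the tth row of the coefficient matrix $C$ be denoted as $c_t^T$. Thus the tth element of the estimator is

$$\hat{x}_k^t = \bar{x}_k^t + c_t^T(y_k - \bar{y}_k), \tag{3.10}$$

where the superscript $t$ refers to the tth element of the corresponding vector. The estimation error of the estimator (3.10) is

$$e_k^t = x_k^t - \bar{x}_k^t - c_t^T(y_k - \bar{y}_k).$$

The aim is to find the coefficients $c_t$ for the linear combination $c_t^T(y_k - \bar{y}_k)$ that minimize the expected value of the squared error, that is

$$\hat{x}_k^t = \arg\min_{\hat{x}_k^t} E\left[\left(x_k^t - \bar{x}_k^t - c_t^T(y_k - \bar{y}_k)\right)^2\right]. \tag{3.11}$$

The gradient with respect to $c_t$ is zero at the critical point of the quadratic function in (3.11), that is

$$\begin{aligned}
\nabla_{c_t} E\left[\left(x_k^t - \bar{x}_k^t - c_t^T(y_k - \bar{y}_k)\right)^2\right] &= 0 \\
E\left[\left(x_k^t - \bar{x}_k^t - c_t^T(y_k - \bar{y}_k)\right)(y_k - \bar{y}_k)\right] &= 0,
\end{aligned} \tag{3.12}$$

because the order of the linear operators gradient and expectation can be changed. The Hessian of the quadratic function is $2E\left[(y_k - \bar{y}_k)(y_k - \bar{y}_k)^T\right] = 2P_{yy_k}$, which is positive definite when $P_{yy_k}$ is non-singular, so the zero of the gradient is the global minimum. The equation (3.12) is called the principle of orthogonality, and it has to hold for every $t = 1, \ldots, n$, which leads to the matrix form of the equation,

$$\begin{aligned}
E\left[\left(x_k - \bar{x}_k - C(y_k - \bar{y}_k)\right)(y_k - \bar{y}_k)^T\right] &= 0 \\
P_{xy_k} - CP_{yy_k} &= 0 \\
C &= P_{xy_k}P_{yy_k}^{-1}.
\end{aligned} \tag{3.13}$$

Substituting (3.13) into equation (3.9) leads to the BLU estimator,

$$\hat{x}_k = \bar{x}_k + P_{xy_k}P_{yy_k}^{-1}(y_k - \bar{y}_k). \tag{3.14}$$

The corresponding MSE matrix is

$$\begin{aligned}
E(e_ke_k^T) &= E\left[\left(x_k - \bar{x}_k - C(y_k - \bar{y}_k)\right)\left(x_k - \bar{x}_k - C(y_k - \bar{y}_k)\right)^T\right] \\
&= E\left[(x_k - \bar{x}_k)(x_k - \bar{x}_k)^T\right] - E\left[(x_k - \bar{x}_k)(y_k - \bar{y}_k)^T\right]P_{yy_k}^{-1}P_{yx_k} \\
&\quad - P_{xy_k}P_{yy_k}^{-1}E\left[(y_k - \bar{y}_k)(x_k - \bar{x}_k)^T\right] + P_{xy_k}P_{yy_k}^{-1}E\left[(y_k - \bar{y}_k)(y_k - \bar{y}_k)^T\right]P_{yy_k}^{-1}P_{yx_k} \\
&= P_{xx_k} - P_{xy_k}P_{yy_k}^{-1}P_{yx_k}.
\end{aligned}$$

It can be shown that if the state $x_k$ and the measurement $y_k$ have Gaussian distributions and the state and measurement functions are linear, then the BLU estimator (3.14) and its MSE matrix give the mean and the covariance of the conditional pdf of the state, which is then Gaussian. That leads to the equations of the Kalman filter (Section 3.4.2) [Bar-Shalom and Li 2001].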

The BLU estimator is commonly used also for the non-linear measurement or state

models by using different approximations, which lead to the non-linear Kalman filter.

Different approximations for handling the non-linearity of the system model lead to different non-linear Kalman filters. The two most common ones are the Extended Kalman Filter (EKF) [Bar-Shalom and Li 2001] and the Unscented Kalman Filter (UKF) [Julier and Uhlmann 2004].

3.4.2 Kalman filter

In this section the linear stationary state model (3.5) is used with the linear mea-

surement model. In addition, the state and measurement errors are assumed to have

Gaussian pdf. These assumptions of the dynamics of the system are called the linear-

Gaussian assumptions of the system, which makes it possible to exploit the Kalman

filter (KF) algorithm.


The measurement model is based on the static estimator discussed in Chapter 2. The static estimator provides the conditional pdf of the state, which is a piecewise constant distribution (2.17). Thus the static estimator provides an expectation of the posterior, denoted as $\hat{x}_k^{S,\mathrm{MEAN}}$, and a covariance matrix $R_k^S$, which can be used as a linear measurement of the system. The measurement noise is approximated to have a zero-mean Gaussian pdf with the covariance obtained from the static estimator. This results in the measurement model

$$y_k = H_kx_k + v_k, \qquad v_k \sim N(0, R_k),$$

where the measurement matrix $H_k = I$ and $R_k = R_k^S$. The BLU estimator derived in Section 3.4.1 can be simplified to the Kalman filter algorithm by exploiting the linear-Gaussian assumptions. The expected value of the state, or the prior state,

simplifies according to the stationary state model, that is

$$\bar{x}_k = F_{k-1}\hat{x}_{k-1},$$

where the state transition matrix $F_{k-1} = I$. Similarly, the expected value of the measurement simplifies to the form

$$\bar{y}_k = H_k\bar{x}_k.$$

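The covariance of the prior, denoted as $P_k^-$, and the covariance of the posterior, denoted as $P_k$, are derived next.

$$\begin{aligned}
V\left(\begin{bmatrix} x_k \\ y_k \end{bmatrix}\right) &= V\left(\begin{bmatrix} x_k \\ H_kx_k + v_k \end{bmatrix}\right) = V\left(\begin{bmatrix} I \\ H_k \end{bmatrix}x_k\right) + V\left(\begin{bmatrix} 0 \\ v_k \end{bmatrix}\right) \\
&= \begin{bmatrix} I \\ H_k \end{bmatrix}P_k^-\begin{bmatrix} I & H_k^T \end{bmatrix} + \begin{bmatrix} 0 \\ I \end{bmatrix}R_k\begin{bmatrix} 0 & I \end{bmatrix} \\
&= \begin{bmatrix} P_k^- & P_k^-H_k^T \\ H_kP_k^- & H_kP_k^-H_k^T \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & R_k \end{bmatrix} \\
&= \begin{bmatrix} P_k^- & P_k^-H_k^T \\ H_kP_k^- & H_kP_k^-H_k^T + R_k \end{bmatrix},
\end{aligned} \tag{3.15}$$

where

$$P_k^- = P_{xx_k} = F_{k-1}P_{k-1}F_{k-1}^T + Q_{k-1} \tag{3.16}$$

according to the linear state model.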


Substituting (3.15)-(3.16) into the BLU estimator leads to the Kalman filter algorithm, that is

$$\hat{x}_k = F_{k-1}\hat{x}_{k-1} + P_{xy_k}P_{yy_k}^{-1}(y_k - \bar{y}_k),$$

and the MSE matrix is

$$\begin{aligned}
E(e_ke_k^T) = P_k &= P_{xx_k} - P_{xy_k}P_{yy_k}^{-1}P_{yx_k} \\
&= P_k^- - P_k^-H_k^T(H_kP_k^-H_k^T + R_k)^{-1}H_kP_k^-.
\end{aligned}$$

In this work the stationary state model is used and the state estimate from the static estimator is used as a linear measurement. Thus the state and measurement matrices are $F_k = H_k = I$.
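With $F_k = H_k = I$ the Kalman filter step becomes particularly simple. The Python sketch below performs one prediction and update for a 2-dimensional location, using small helper functions for 2×2 matrices; the helper names and the numerical values are assumptions made for illustration, and the measurement `y` stands for the mean of the static estimator with the covariance `r`.

```python
def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inv(a):
    """Inverse of a non-singular 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def kalman_step(x_prev, p_prev, q, y, r):
    """One Kalman filter step with F = H = I (stationary state model)."""
    # Prediction: the mean is unchanged, the covariance grows by Q.
    x_prior = list(x_prev)
    p_prior = add(p_prev, q)
    # Update: gain K = P^- (P^- + R)^(-1), since H = I.
    gain = mul(p_prior, inv(add(p_prior, r)))
    innovation = [y[i] - x_prior[i] for i in range(2)]
    x_post = [x_prior[i] + sum(gain[i][j] * innovation[j] for j in range(2))
              for i in range(2)]
    p_post = sub(p_prior, mul(gain, p_prior))
    return x_post, p_post

identity = [[1.0, 0.0], [0.0, 1.0]]
x_post, p_post = kalman_step(
    x_prev=[0.0, 0.0], p_prev=identity, q=identity,
    y=[4.0, 2.0], r=[[2.0, 0.0], [0.0, 2.0]],
)
```

In this example the prior and the measurement have equal covariances, so the posterior mean lies halfway between the prior mean and the measurement.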

3.4.3 Non-linear Kalman filter

As discussed, in location fingerprinting the measurement model is a non-linear func-

tion with the non-Gaussian error distributions. Thus the BLU estimator can not be

simplified to the Kalman filter algorithm. However, because the number of CPs is

finite, it is possible to compute the unknown quantities of the BLU estimator without

any additional approximation, as for example the linearization of the state and mea-

surement models in EKF. This leads to the approach which combines the non-linear

Kalman filter and location fingerprinting in a novel way.

The aim is to compute the unknowns (3.8), i.e. the expected value and the covari-

ance matrix of the joint distribution of the state and the measurement. The discrete

fingerprint histograms form the measurement model, which is a piecewise continuous

function defined in the grid B with each grid cell Bi centered by the calibration point

pi. Unless a parametric approximation is applied to the RSSI histograms, the measurement yk is a discrete random variable, in contrast to the state xk, which is a

continuous random variable. The accuracy of the measurement model depends on the

calibration phase, i.e. the fingerprints measured before the location estimation phase.

In the ideal radio map fingerprint measurements are done continuously at every loca-

tion and they include the true distribution of the RSSI at every location. However,

this is not the case; the ideal radio map is approximated with the finite amount of CPs

and the finite measurement time interval at each CP. Thus fingerprinting algorithms

are always based on the approximative measurement model, but the accuracy can be

enhanced by increasing the density of the grid and by increasing the time interval of

the fingerprint measurements (Section 4.4.2).


As discussed, the filtering approach includes two phases, the prediction phase and the update phase. The prediction phase is considered first. The conditional pdf of the state at the time $t_{k-1}$ is approximated as a Gaussian pdf with the mean $\hat{x}_{k-1}$ and the covariance $P_{k-1}$. Then the stationary state model is applied, which increases the covariance $P_{k-1}$. This smooth Gaussian distribution is integrated over each cell, that is

$$\alpha_i = \int_{B_i} p(x_{k-1}|y_{1:k-1})\,dx_{k-1}. \tag{3.17}$$

This leads to the piecewise constant prior state distribution, where each cell $B_i$ has a constant density, that is

$$p(x_k|y_{1:k-1}) \approx \sum_{i \in L_f} \frac{\alpha_i}{|B_i|}\chi_{B_i}(x_k). \tag{3.18}$$

The sum in (3.18) is computed over the set $L_f$, which is defined as

$$L_f = \{\, i \mid N_{y_k} \subseteq N_{a_i} \,\}. \tag{3.19}$$

Thus the prior state approximation (3.18) depends on the set of feasible cells $B_i$, which satisfy (3.19).

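The unknown quantities $\bar{x}_k$, $\bar{y}_k$, $P_{xx_k}$, $P_{xy_k}$ and $P_{yy_k}$ of the BLU estimator are derived next using the prior state distribution (3.18). The expected value of the state is computed over the set $L_f$,

$$\begin{aligned}
\bar{x}_k &= \int x_k\,p(x_k|y_{1:k-1})\,dx_k = \int x_k \sum_{i \in L_f} \frac{\alpha_i}{|B_i|}\chi_{B_i}(x_k)\,dx_k \\
&= \sum_{i \in L_f} \frac{\alpha_i}{|B_i|}\int_{B_i} x_k\,dx_k = \sum_i \alpha_ip_i,
\end{aligned} \tag{3.20}$$

because the expected value of the location in the rectangular cell $B_i$ is the center of the cell $p_i$. Similarly, the expected value of the discrete measurement is

$$\begin{aligned}
\bar{y}_k &= \sum_{y_k} y_k\,p(y_k) = \sum_{y_k} y_k \int p(y_k|x_k)\,p(x_k|y_{1:k-1})\,dx_k \\
&= \int p(x_k|y_{1:k-1}) \sum_{y_k} y_k\,p(y_k|x_k)\,dx_k = \int p(x_k|y_{1:k-1})\,E(y_k|x_k)\,dx_k \\
&= \sum_i \frac{\alpha_i}{|B_i|}\int_{B_i} E(y_k|x_k)\,dx_k = \sum_i \alpha_ia_i.
\end{aligned} \tag{3.21}$$

In equation (3.21) the integral over the measurement turns into a summation, because the measurement has a discrete distribution $H^{N}_{a_{ij}}$.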

It is seen that the distributions of both the prior state and the measurement can be formulated as mixtures of distributions. For the derivation of the covariances $P_{xx_k}$ and $P_{yy_k}$, the general formula for the covariance matrix of a random variable whose pdf is a mixture of distributions is derived next. Let the random variable $u$ have the pdf

$$p_u(u) = \sum_i \alpha_ip_{u_i}(u),$$

where $\sum_i \alpha_i = 1$. The following lemma is needed.

Lemma 3.4 [Computational formula for the covariance] Let $u \in \mathbb{R}^n$ be a random variable. The covariance of $u$ can be computed as

$$V(u) = E(uu^T) - [E(u)][E(u)]^T.$$

Proof.

$$\begin{aligned}
V(u) &= E\left[(u - E(u))(u - E(u))^T\right] \\
&= E\left[uu^T - uE(u)^T - E(u)u^T + [E(u)][E(u)]^T\right] \\
&= E(uu^T) - [E(u)][E(u)]^T. \qquad \square
\end{aligned}$$

Thus by using Lemma 3.4 the covariance matrix of $u$ is

$$\begin{aligned}
V(u) &= \sum_i \alpha_i \int uu^Tp_{u_i}(u)\,du - E(u)E(u)^T \\
&= \sum_i \alpha_i\left(P_{u_i} + E(u_i)E(u_i)^T - E(u)E(u)^T\right) \\
&= \sum_i \alpha_i\left[P_{u_i} + (E(u_i) - E(u))(E(u_i) - E(u))^T\right],
\end{aligned} \tag{3.22}$$

where $P_{u_i} = V(u_i)$.
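The mixture formula (3.22) can be verified in the scalar case with a few lines of Python. The sketch computes the mean and the variance of a mixture from the weights, means and variances of its components; the function name and the numerical values are illustrative only.

```python
def mixture_mean_var(weights, means, variances):
    """Mean and variance of a scalar mixture, following the last line of (3.22)."""
    mean = sum(w * m for w, m in zip(weights, means))
    var = sum(w * (v + (m - mean) ** 2)
              for w, m, v in zip(weights, means, variances))
    return mean, var

# Two equally weighted components with unit variance and means 0 and 2.
weights, means, variances = [0.5, 0.5], [0.0, 2.0], [1.0, 1.0]
mean, var = mixture_mean_var(weights, means, variances)

# Cross-check with the first line of (3.22): sum_i alpha_i E(u_i^2) - E(u)^2.
var_alt = sum(w * (v + m ** 2) for w, m, v in zip(weights, means, variances)) - mean ** 2
```

The variance of the mixture consists of the average variance of the components plus the spread of the component means around the overall mean.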

By using equation (3.22) the covariance matrix $P_{xx_k}$ is

$$P_{xx_k} = \sum_i \alpha_i\left(P_{x_i} + (p_i - \bar{x}_k)(p_i - \bar{x}_k)^T\right). \tag{3.23}$$

The matrix $P_{x_i}$, which is the covariance matrix of the state in the cell $B_i$, can be written as

$$P_{x_i} = \begin{bmatrix} P_{x_i}^{11} & P_{x_i}^{12} \\ P_{x_i}^{21} & P_{x_i}^{22} \end{bmatrix}.$$

To compute the diagonal elements of the matrix, the following lemma is needed.


Lemma 3.5 Let $x \in \mathbb{R}$ be a random variable with the uniform distribution

$$p(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b, \\ 0, & x < a \vee x > b, \end{cases}$$

where $a < b$. Then the variance of $x$ is

$$V(x) = \frac{(b-a)^2}{12}.$$

Proof. By using Lemma 3.4 the variance of $x$ can be computed as

$$V(x) = E(x^2) - (E(x))^2 = \int_a^b \frac{x^2}{b-a}\,dx - \left(\frac{a+b}{2}\right)^2 = \frac{b^3 - a^3}{3(b-a)} - \left(\frac{a+b}{2}\right)^2 = \frac{(b-a)^2}{12}. \qquad \square$$

Thus the diagonal elements of $P_{x_i}$ are

$$P_{x_i}^{11} = \int_{B_{i1}} \frac{(x_1 - p_{i1})^2}{|B_{i1}|}\,dx_1 = \frac{|B_{i1}|^2}{12},$$

and similarly

$$P_{x_i}^{22} = \frac{|B_{i2}|^2}{12},$$

where the subscripts 1 and 2 denote the first and second components of the random vector $x$, or the horizontal and vertical sides of the rectangle $B_i$, respectively. The covariance between the components of $x$ in the cell $B_i$ is

$$\int_{B_{i1}}\int_{B_{i2}} \frac{(x_1 - p_{i1})(x_2 - p_{i2})}{|B_{i1}||B_{i2}|}\,dx_1\,dx_2 = \frac{|B_{i1}|(p_{i1} - p_{i1})\,|B_{i2}|(p_{i2} - p_{i2})}{|B_{i1}||B_{i2}|} = 0.$$

Hence the covariance of the state is given by equation (3.23), where $P_{x_i}$ is the diagonal matrix

$$P_{x_i} = \operatorname{diag}\left(\frac{|B_{i1}|^2}{12}, \frac{|B_{i2}|^2}{12}\right).$$

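The covariance of the state and the measurement is

$$\begin{aligned}
P_{xy_k} &= \sum_y \int (x - \bar{x})(y - \bar{y})^T p(x, y)\,dx \\
&= \sum_y \int (x - \bar{x})(y - \bar{y})^T p(y|x) \sum_i \frac{\alpha_i}{|B_i|}\chi_{B_i}(x)\,dx \\
&= \sum_i \frac{\alpha_i}{|B_i|}\int_{B_i} (x - \bar{x}) \sum_y (y - \bar{y})^T p(y|x)\,dx \\
&= \sum_i \frac{\alpha_i}{|B_i|}\int_{B_i} (x - \bar{x})\,E\left[(y - \bar{y})^T \mid x\right]dx \\
&= \sum_i \alpha_i(p_i - \bar{x})(a_i - \bar{y})^T.
\end{aligned} \tag{3.24}$$

The time index $k$ is omitted in equation (3.24) to simplify the notation.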

It is seen that the pdf of the measurement can be written as the following mixture,

$$p(y_k) = \sum_i \alpha_i \int_{B_i} \frac{p(y_k|x_k)}{|B_i|}\,dx_k.$$

Because Lemma 3.4 also holds for discrete random variables, equation (3.22) can be applied to formulate the covariance of the measurement, that is

$$P_{yy_k} = \sum_i \alpha_i\left(R_i + (a_i - \bar{y}_k)(a_i - \bar{y}_k)^T\right),$$

where

$$R_i = \operatorname{diag}\left(\sigma^2_{iN_{y_1}}, \ldots, \sigma^2_{iN_{y_n}}\right),$$

because the samples from different APs are assumed to be independent.

It is interesting to note that if the Gaussian approximation is made to the RSSI histograms, that is

$$H^{N}_{a_{ij}} \approx N^{a_{ij}}_{\sigma_{ij}^2}(x),$$

the BLU estimator does not change. That is because the mean and the variance of the histograms do not change.
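The moment computations of this section can be illustrated with a deliberately simplified Python sketch: the state is 1-dimensional (a position along a corridor) and the measurement is the RSSI of a single AP, so that all covariances are scalars and no matrix inversion is needed. This simplification, the function name and the numerical values are assumptions made for illustration; the full method uses the 2-dimensional location and a vector of RSSI values.

```python
def blu_update_1d(alphas, centers, widths, rssi_means, rssi_vars, y):
    """BLU update from a piecewise constant prior in one dimension.

    alphas      prior cell weights alpha_i (sum to one)
    centers     cell centers p_i
    widths      cell widths |B_i|
    rssi_means  fingerprint means a_i for one AP
    rssi_vars   fingerprint variances sigma_i^2 for the same AP
    y           measured RSSI
    """
    x_bar = sum(a * p for a, p in zip(alphas, centers))                 # (3.20)
    y_bar = sum(a * m for a, m in zip(alphas, rssi_means))              # (3.21)
    p_xx = sum(a * (w ** 2 / 12.0 + (p - x_bar) ** 2)                   # (3.23)
               for a, p, w in zip(alphas, centers, widths))
    p_xy = sum(a * (p - x_bar) * (m - y_bar)                            # (3.24)
               for a, p, m in zip(alphas, centers, rssi_means))
    p_yy = sum(a * (s + (m - y_bar) ** 2)
               for a, m, s in zip(alphas, rssi_means, rssi_vars))
    x_hat = x_bar + p_xy / p_yy * (y - y_bar)                           # (3.14)
    p_post = p_xx - p_xy ** 2 / p_yy
    return x_hat, p_post, p_xx

# Two hypothetical 10 m cells with equal prior weight; the measurement matches cell 1.
x_hat, p_post, p_xx = blu_update_1d(
    alphas=[0.5, 0.5], centers=[0.0, 10.0], widths=[10.0, 10.0],
    rssi_means=[-50.0, -60.0], rssi_vars=[4.0, 4.0], y=-50.0,
)
```

The estimate moves from the prior mean 5 m towards the center of the first cell, and the posterior variance is smaller than the prior variance.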

Applying non-linear Kalman filter to the discrete modeling of the state

In this work also the discrete modeling of the state was examined and tested in Section

4.5.2. Modeling the state as a discrete random variable means that the state x or

location is defined only at the CPs, which changes the computation of the unknown

quantities of the non-linear Kalman filter studied in the previous section.

The computation of the weights $\alpha_i$ differs between the discrete and the continuous modeling of the state. In the discrete case

$$\alpha_i = \frac{N^{\hat{x}_{k-1}}_{\Sigma_{k-1}}(p_i)\,|B_i|}{\sum_j N^{\hat{x}_{k-1}}_{\Sigma_{k-1}}(p_j)\,|B_j|}.$$

The computation of the expected values $\bar{x}_k$ and $\bar{y}_k$ does not change from the continuous case; the expected values of the state and the measurement are directly the weighted sums according to equations (3.20) and (3.21). The quantities $P_{xy_k}$ and $P_{yy_k}$ also have similar expressions as in the continuous case. The only difference is in the covariance $P_{xx_k}$, which can be formulated by using equation (3.22), that is

$$P_{xx_k} = \sum_i \alpha_i(p_i - \bar{x})(p_i - \bar{x})^T,$$

because the covariance matrix $P_{x_i} = 0$. Thus the formulation of the covariance matrix $P_{xx_k}$ is much simpler when only the CPs are considered.

Chapter 4

Implementations and Results

The algorithms and methods discussed in Chapters 2 and 3 were implemented and

tested in the Wireless Local Area Network (WLAN) which is discussed in Section 4.1.

The structure of implementation is presented in Appendix A. It considers the design

of the Matlab codes, the data structures and the software used. In Sections 4.2 and

4.3 the collection of the radio maps and the test data is explained. In Section 4.4

the test results of the static location estimation algorithms are presented. Section 4.4

also covers the impact of several environment parameters on the performance of the

algorithms. In Section 4.5 the filters are applied to the test data. The test data are the measurements collected in the location estimation phase with a MacBook laptop equipped with an AirPort network card.

4.1 Wireless local area network and IEEE 802.11 standard

Most indoor localization methods rely on measurements of the radio signals used in

wireless local area networks. WLAN infrastructure consists of components that are

connected to the network. The components needed in localization are the access points (APs), which communicate with the wireless devices. These wireless devices are normally laptops, but any mobile device with wireless network capability is acceptable. In this work a laptop was used as the mobile measurement unit. The MUs interact with each other using Wireless Network Interface Controllers (WNICs), i.e. network cards, and the IEEE (Institute of Electrical and Electronics Engineers) 802.11 protocol family. The terms IEEE 802.11 and Wi-Fi are often used interchangeably. APs transmit and receive data over radio frequencies so that the wireless devices can communicate. APs


are designed not to interfere with each other; APs close to each other are programmed to operate on different channels. In practice, the areas covered by the APs overlap, which can cause interference [Saha et al. 2003].

APs periodically transmit beacon frames which include basic information about the

AP. MUs use this so-called basic service set (BSS) information to determine which particular AP to communicate with. The BSS carries information about the link quality, which can be derived from the received signal strength and the background noise. The MU periodically sweeps from channel to channel to determine the AP with the best link quality.

Thus the signal strengths of all APs in range can be determined. [Wallbaum and

Spaniol 2006]

The IEEE 802.11 standard provides two spectrum bands, 5 GHz and 2.4 GHz, but the latter is the most utilized and the only ISM (Industrial, Scientific and Medical) radio band accepted worldwide. WLAN offers ubiquitous coverage in large areas. Thus WLAN can be used in indoor positioning systems, although it was not originally designed for localization. [Kotanen et al. 2003]

The Received Signal Strength Indicator (RSSI) is the most common type of measurement in indoor localization and it is used in this work too. Network cards do not need any additional hardware for reporting the RSSI.

The information received by the MU includes a time stamp which is synchronized with the timer in the APs. However, the timer resolution is only 1 µs, which is too coarse for time-based positioning (TOA, TDOA) [Kotanen et al. 2003].
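As a rough check of why a 1 µs resolution is too coarse, assume that the radio signal propagates at approximately the speed of light, $c \approx 3 \times 10^8$ m/s (this figure is an assumption added here, not a value from the cited source). One timer tick then corresponds to a range uncertainty of about

```latex
\Delta d = c \, \Delta t \approx \left(3 \times 10^{8}\ \mathrm{m/s}\right) \times \left(10^{-6}\ \mathrm{s}\right) = 300\ \mathrm{m},
```

which is far larger than the few-meter errors that are of interest in indoor positioning.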

4.2 Radio maps

The radio maps were collected in an office area of the university. The grid cells are

shown in Figure 4.1 and the CPs are at the center of the cells. Figure 4.2 shows

the average RSSI values from one AP at the different calibration points and it shows

how the RSSI values varied through the grid. The cells were divided according to the floor plan; walls, doors and stairways were used as boundaries. Thus the cells have dissimilar areas.

Four radio maps were constructed to test the effect of the density of the radio map, the orientation of the MU and the duration of the calibration at each CP. The radio map $30_1$ was collected at 40 CPs, with a fixed orientation and a 30 s measurement period. The radio map $60r_1$ was collected at the same CPs, but with a 60 s measurement period and by rotating (denoted by r) at the CPs. The rotation was done to minimize the effect of the orientation of the user, and the calibration time was lengthened to increase the reliability of the fingerprint. The radio map sparse was constructed by removing every second entry from the radio map $60r_1$. The radio map $60r_2$ was also collected with the 60 s measurement period and by rotating the MU, but the density of the radio map was increased to 77 CPs. The subscripts 1 and 2 denote the sparser and the denser radio map, respectively (Figure 4.1). The fingerprints were measured from 20 APs: 11 of them were located on the same floor where the calibration was done, 5 on the lower floor and 4 on the upper floor.

Figure 4.1: Two radio maps $60r_1$ (a) and $60r_2$ (b) with different densities. [Floor plans marking the calibration points; scale bar 67 m.]

The idea in location fingerprinting is to create a radio map which captures the unpredictable variation of the RSSI over the test bed. Figure 4.2 illustrates how the signal strength is influenced by the obstacles in the floor plan, which makes it hard to fit any model to it. In some closed regions, such as corridors, the RSSI can be more predictable, but obstacles such as humans, doors, windows and walls have a significant impact on the signal propagation. In Figure 4.2 the signal also propagates through the windows to the opposite corridor. In Figure 4.3 the AP is located on the floor above the test bed and the radio signal is blocked by the ceiling. The CP with the strongest signal strength is far from the AP, because the signal propagates through the stairways. An RSSI measured on a different floor from the one where the AP is located can be very unpredictable. The RSSI histograms were measured at the CPs and the pattern of the histograms varied a lot. Figure 4.4(a)


Figure 4.2: Obstacles such as doors block the radio signal. [Floor plan showing the AP, the signal strength at the CPs and a glass door.]

shows a histogram which is fairly symmetric and nearly unimodal. However, Figure 4.4(b) illustrates how the histogram can be skewed or even multi-modal. Such histograms make it very hard to model the distribution of the RSSI at the different locations.

4.3 Test data

Tracking the MU was tested by walking at a constant speed in the neighborhood of the cells. The tracks are shown in Figure 4.6. Figure 4.5(b) shows the 1 s averages of the RSSI on the first track. The RSSI curve is not continuous, because the considered AP was not heard at all times. The correlation between the distance to the AP and the RSSI can be seen. The correlation is especially clear at the locations closest to the AP, because the AP was located on the same floor as the MU.

The measurements were collected by the same MU which collected the radio maps. The raw data was preprocessed afterwards and converted to Matlab structures using the Matlab functions described in Appendix A. During the preprocessing the weakest


Figure 4.3: Strongest signal strength is not necessarily measured at the closest CP. [Floor plan showing the AP, the signal strength at the CPs and a stairway.]

Figure 4.4: Normalized histograms can be skewed and multi-modal. [Panels (a) and (b): relative frequency versus RSSI.]

signals of the test data were ignored to increase the reliability of the measurements: RSSI values weaker than -95 were removed from the raw data. The same set of 20 APs was used as in the calibration phase. The number of APs heard per time step varied from 2 to 15 in the test data. Figure 4.5(a) shows that the most common number of APs heard was 9 and that at least 2 APs were heard at every time step in the tracks.
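The preprocessing described above can be sketched as follows. This is only an illustration under stated assumptions: the sample format `(timestamp_s, ap_id, rssi)`, the function name and the use of whole seconds as time indices are hypothetical and do not describe the actual Matlab functions of Appendix A.

```python
from collections import defaultdict

RSSI_FLOOR = -95  # samples weaker than this value were ignored in this work


def one_second_averages(samples, floor=RSSI_FLOOR):
    """Group samples into 1 s intervals and average the RSSI per AP.

    samples -- iterable of (timestamp_s, ap_id, rssi), timestamps >= 0 (assumed format)
    Returns {time_index: {ap_id: mean_rssi}}.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for t, ap, rssi in samples:
        if rssi < floor:
            continue  # drop the weakest, least reliable samples
        buckets[int(t)][ap].append(rssi)
    return {k: {ap: sum(v) / len(v) for ap, v in aps.items()}
            for k, aps in buckets.items()}


averages = one_second_averages([
    (0.1, "ap1", -60), (0.6, "ap1", -70), (0.7, "ap2", -97), (1.2, "ap1", -80),
])
```

The number of APs heard at a time step is then simply the number of keys in the inner dictionary of that time index.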


Figure 4.5: The number of APs heard per time step during the location estimation phase (a) and the correlation between the RSSI and the distance to the AP (b). [Panel (a): relative frequency versus number of APs heard per time step. Panel (b): distance d / m and RSSI versus time index.]

4.4 Static location estimation algorithms

The different methods described in Chapter 2 were tested. Only real data was used, which introduces more parameters to tune than simulated data would. Most of the tests were run using the four tracks described in Section 4.3. For the KNN method the stationary test case was also used. Unless otherwise stated, the optimal test bed was used to find the best performance of the algorithms, that is, the dense radio map $60r_2$ with all the available APs. The location estimate was inferred from the samples received during each 1 s time interval. Most of the methods use each of the received RSSI samples individually, but some methods use averaged values.

The mean of the norms of the 2-dimensional error vectors, denoted as ME, was used as

the primary performance measure. If the solver was not able to produce an estimate

at some time step, the estimate from the previous time step was used instead. The initial state was set to the location of the strongest AP heard at the first time step. The strongest AP means the AP which transmitted the strongest sample among all samples in the 1 s time interval.
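The performance measure and the fallback rule can be sketched as below. The sketch assumes that a failed time step is represented by `None` and that the initial state is passed in explicitly; both conventions and the function name are hypothetical.

```python
import math


def mean_error(estimates, truths, initial):
    """Mean of the norms of the 2-dimensional error vectors (ME).

    estimates -- list of (x, y) or None when the solver produced no estimate
    truths    -- list of true (x, y) positions
    initial   -- estimate used if the solver fails at the first time step
    """
    last = initial
    errors = []
    for est, true in zip(estimates, truths):
        if est is not None:
            last = est  # otherwise the previous estimate is reused
        errors.append(math.hypot(last[0] - true[0], last[1] - true[1]))
    return sum(errors) / len(errors), errors


me, errors = mean_error([(3.0, 4.0), None, (0.0, 0.0)],
                        [(0.0, 0.0), (0.0, 0.0), (0.0, 3.0)],
                        initial=(0.0, 0.0))
```

The per-step errors returned alongside the ME can be reused to compute the other reported statistics, such as the RMSE and the percentiles.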

As discussed in Chapter 2, the location estimation methods can be divided into deterministic and probabilistic algorithms. In the probabilistic approach, different methods to compute the likelihoods at the grid cells $B_i$ are studied. Some of the methods interpret the normalized RSSI histogram as the distribution of the RSSI. In those methods the RSSI samples are assumed to be independent and identically distributed (i.i.d.) at the cell $B_i$, when the likelihood at the cell $B_i$ is computed


Figure 4.6: Test data includes the four tracks (a) - (d). [Each panel marks the START and the END of the track.]

according to Equation (2.16), by multiplying the individual likelihoods corresponding to the different samples from the various APs. Thus each individual sample in the test data was used to obtain its own likelihood value.
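Under the i.i.d. assumption the cell likelihood is a product over the individual samples. The sketch below illustrates this in the log domain to avoid numerical underflow; it is not a transcription of Equation (2.16), and the data layout (a per-AP dictionary of RSSI probabilities for one cell) is an assumption made for the example.

```python
import math


def cell_log_likelihood(samples, cell_pmf):
    """Log-likelihood of the samples in one grid cell under the i.i.d. assumption.

    samples  -- list of (ap_id, rssi) received during one time step
    cell_pmf -- {ap_id: {rssi: probability}} for the cell (assumed layout)
    """
    total = 0.0
    for ap, rssi in samples:
        p = cell_pmf.get(ap, {}).get(rssi, 0.0)
        if p == 0.0:
            return float("-inf")  # a single impossible sample zeroes the product
        total += math.log(p)  # a sum of logs equals the log of the product
    return total


pmf = {"ap1": {-60: 0.5, -61: 0.5}, "ap2": {-70: 1.0}}
ll = cell_log_likelihood([("ap1", -60), ("ap2", -70)], pmf)
```

The early return shows why bins with zero probability are problematic: one unseen RSSI value removes the cell from consideration, which motivates the prior and the kernel smoothing discussed later in this chapter.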


4.4.1 Effect of parameters in algorithms

In this section the static location estimation algorithms studied in this work are examined. Each of the location estimation algorithms has several parameters which were varied to find the best performance. The effect of varying different parameters of the test bed is also examined in this section.

The mean error was used as the primary performance measure, but the tables also report the root mean squared error (RMSE), the median, the maximum and the 25th, 75th and 95th percentiles of the error.

K-nearest neighbor

The K-nearest neighbor method and its modifications were tested. Different numbers of CPs were considered in the computation of the estimate by varying the parameter K, and the location estimate was computed as the average of the coordinates of the K "nearest" CPs. Different measures were used to compute the distances in the signal space. The effect of varying the parameter K in the KNN method is examined first.

Figure 4.7: The test points for the stationary MU. [Floor plan marking the calibration points and the test points A and B.]

Note that the parameter value K = 1 gives the NN method, against which the KNN method is compared. It is examined whether considering multiple "nearest" neighbors benefits the performance. Locating the stationary MU is considered first.

Two test cases for the stationary MU were considered. In each case, one of the fingerprints was chosen as the test data and that fingerprint was removed from the radio map. The two chosen stationary test cases, test points A and B, are shown in Figure 4.7. The test point A is located near a CP. Figure 4.8(a) shows that if the MU is located near a CP, the KNN method only corrupts the estimate by moving it away from the CP which is near the MU. However, for the stationary MU,

Figure 4.8: Varying number of neighbors (a) and distance measure (b) in KNN method. [Panel (a): ME / m versus K for test point A, test point B and the tracks. Panel (b): ME / m versus the type of norm (p = 1, ..., 5 and ∞).]

considering multiple "nearest" neighbors during the location estimation phase can perform better than considering just the "nearest" CP in the signal space. The test point B is not near any CP. Thus the KNN method with small values of K can improve the estimate compared to the NN estimate by averaging the coordinates of the CPs around the MU. Large values of K corrupt the estimate, because the CPs far from the MU pull the estimate away from the true position. This is illustrated in Figure 4.8(a).

Varying the parameter K in tracking the moving MU is considered next. From the stationary case it can be concluded that, if the true position is near a CP, the NN method can perform better than the KNN method. Thus the best method depends on the test data used, which should be considered in the design of the location estimation algorithm. In this work, the MU was moved through the CPs and considering multiple CPs did not improve the estimate (Figure 4.8(a)). Thus the NN method is used in the following tests against other methods as a representative of the KNN method.

The impact of the distance metric is considered next. The parameter p of the p-norm distance measure was varied in the NN method. The ∞-norm was also tested. Figure 4.8(b) illustrates that small values of p give the smallest ME and also outperform the infinity norm.

Prasithsangaree et al. [2002] propose that the sample size or the standard deviation of the fingerprint can be interpreted as a measure of the reliability of that fingerprint. Thus they used those quantities as weights in the computation of the generalized p-norm. However, they did not obtain any improvement with that weighting scheme. In this work the direct weighting of the distances with the inverse of the sample size improved the results slightly; the ME decreased from 5.6 m to 5.4 m. The test was done with the NN method and the 1-norm as the distance measure.
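A minimal sketch of the KNN estimate with a p-norm distance in the signal space is given below. It assumes that every fingerprint and the measurement are vectors of mean RSSI values over the same ordered set of APs, which is a simplification; the function names and the toy radio map are hypothetical.

```python
import math


def p_norm_distance(a, b, p=1.0):
    """p-norm distance between two RSSI vectors; p = math.inf gives the infinity norm."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def knn_estimate(measurement, radio_map, k=1, p=1.0):
    """Average the coordinates of the K CPs nearest in the signal space.

    radio_map -- list of ((x, y), fingerprint_vector); k = 1 gives the NN method
    """
    ranked = sorted(radio_map, key=lambda cp: p_norm_distance(measurement, cp[1], p))
    nearest = ranked[:k]
    return (sum(cp[0][0] for cp in nearest) / len(nearest),
            sum(cp[0][1] for cp in nearest) / len(nearest))


radio_map = [((0.0, 0.0), [-50, -70]),
             ((10.0, 0.0), [-60, -60]),
             ((20.0, 0.0), [-70, -50])]
nn = knn_estimate([-52, -69], radio_map, k=1)
knn2 = knn_estimate([-52, -69], radio_map, k=2)
```

In this toy case the measurement is close to the first CP, so increasing K from 1 to 2 moves the estimate away from it, which mirrors the behavior observed at test point A.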

Histogram method

The histogram method is a probabilistic method, where the likelihood in the grid cell $B_i$ is obtained from the RSSI histograms measured at the ith calibration point. The RSSI histograms were normalized so that they could be interpreted as the distributions of the measurement noise. The bin width of the histograms was varied. Figure 4.9(a) illustrates how increasing the bin width simplifies the histograms and levels out minor variations. The results for the variation of the bin width are shown in Figure 4.9(b). The bin width b = 1 gave the smallest ME when $x_{\mathrm{MEAN}}$ was used as the estimator, and that bin width was used in the following tests. Roos et al. [2002] apply Bayes’

Figure 4.9: Histograms with different bin widths (a) and the effect of bin width in the histogram method (b). [Panel (a): probability versus RSSI for the bin widths 1, 2 and 3. Panel (b): ME / m versus bin width for the mean and map estimators.]

theorem and add a uniform distribution to all bins as a prior distribution. In this work, using the uniform prior improved the ME of the posterior mean significantly, from 13.5 m to 5.6 m, and it was used in the following tests with the bin width b = 1.
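One common way to realize a uniform prior over the bins is to add the same pseudo-count to every bin before normalizing. The sketch below follows that reading; the pseudo-count of 1, the integer RSSI range and the function name are assumptions for illustration and may differ from the exact scheme of Roos et al. [2002] and from the implementation of this work.

```python
def histogram_pmf(rssi_samples, lo, hi, bin_width=1, prior_count=1.0):
    """Normalized RSSI histogram over [lo, hi] with a uniform pseudo-count per bin.

    rssi_samples -- integer RSSI values measured at one CP from one AP
    lo, hi       -- integer RSSI range covered by the histogram (assumed known)
    """
    n_bins = (hi - lo) // bin_width + 1
    counts = [prior_count] * n_bins  # uniform prior: every bin starts non-empty
    for r in rssi_samples:
        if lo <= r <= hi:
            counts[(r - lo) // bin_width] += 1
    total = sum(counts)
    return [c / total for c in counts]


pmf_b1 = histogram_pmf([-60, -60, -59], lo=-61, hi=-58, bin_width=1)
pmf_b2 = histogram_pmf([-60, -60, -59], lo=-61, hi=-58, bin_width=2)
```

With the prior no bin has zero probability, so a single unseen RSSI value can no longer force the likelihood of a cell to zero.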

Histogram comparison method

The samples received during each 1 s interval of the test data were collected into histograms and compared to the fingerprint histograms. The histograms were normalized, so that distance measures between probability density functions could be used to measure the similarity between the histograms. Figure 4.10 illustrates that the Simandl and the Bhattacharyya distances gave the smallest ME. Thus the Simandl distance was chosen for the following tests as the histogram comparison measure.
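As an illustration of a histogram comparison measure, the sketch below computes the Bhattacharyya distance in its usual textbook form for two normalized histograms with identical bins. The definition used in Chapter 2 may differ in detail, so this is an assumption rather than the exact measure tested here; the Simandl distance is not sketched.

```python
import math


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions on the same bins."""
    # The Bhattacharyya coefficient measures the overlap of the two distributions.
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return float("inf") if bc == 0.0 else -math.log(bc)


same = bhattacharyya_distance([0.25, 0.5, 0.25], [0.25, 0.5, 0.25])
disjoint = bhattacharyya_distance([1.0, 0.0], [0.0, 1.0])
partial = bhattacharyya_distance([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
```

A small distance means that the test histogram resembles the fingerprint histogram of the cell, so the cell with the smallest distance plays the role of the "nearest" CP.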

Figure 4.10: Varying distance measure in the histogram comparison method. [ME / m for the mean and map estimators with the K-L, K-S, L-F, infinity norm, Bhattacharyya and Simandl measures.]

Kernel method

The kernel method was tested with several types of kernel functions and kernel widths. The kernel functions were applied to the normalized fingerprint histograms with the histogram bin width b = 1. Figure 4.11 illustrates the non-parametric kernel approximation of the histogram with different kernel functions and kernel widths. The figure illustrates how the kernel fit smooths the histogram and gives non-zero probabilities to the bins with zero probability.

Different kernel functions were tested with the kernel width h = 2. Kushki et al. [2005] obtained the best results by using the exponential kernel function (2.31). The same result was observed in this work; Figure 4.12(a) illustrates that the exponential kernel function had the best performance of the kernel functions tested in this work. Thus the exponential kernel function with the kernel width h = 2 was used in the following tests as the kernel method.
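The smoothing effect of the kernel fit can be sketched as follows. The kernel form exp(-|u| / h) is an assumed stand-in for the exponential kernel, because Equation (2.31) is not reproduced in this chapter; the function name is hypothetical.

```python
import math


def kernel_smooth(pmf, h=2.0):
    """Smooth a normalized histogram (1-unit bins) with an exponential-type kernel.

    The kernel exp(-|u| / h) is an assumption; h is the kernel width.
    """
    n = len(pmf)
    smoothed = [sum(pmf[j] * math.exp(-abs(i - j) / h) for j in range(n))
                for i in range(n)]
    total = sum(smoothed)
    return [s / total for s in smoothed]  # renormalize to a probability distribution


smoothed = kernel_smooth([0.0, 0.0, 1.0, 0.0, 0.0], h=2.0)
```

Even a histogram concentrated in a single bin receives non-zero probability in its neighboring bins, and a larger h spreads the mass further.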


Figure 4.11: Non-parametric kernel approximation of the RSSI histogram with different kernel functions and kernel widths. [Probability versus RSSI for the RSSI histogram, a Gaussian kernel with width 0.8 and exponential kernels with widths 0.8 and 2.]

Figure 4.12: The effects of the kernel function (a) and the kernel width (b). [Panel (a): ME / m for the Epanechnikov, normal, exponential, triangle and box kernels. Panel (b): ME / m versus kernel width. Mean and map estimators are shown.]

Parametric approximation of measurement noise

As discussed in Section 2.5.3, the pattern of the RSSI histograms varies a lot and the parametric approximation of the histograms is challenging. In this work several functions were tested to approximate the distribution of the RSSI.

Kaemarungsi [2006] proposes approximating the RSSI histogram with the log-normal distribution. The motivation for the log-normal approximation comes from the left-skewed histograms, which suggest that the RSSI has a left-skewed distribution. Because the log-normal pdf is right-skewed, the absolute RSSI values were used. Figure 4.13(a) illustrates how the log-normal distribution can provide a better fit than the Gaussian approximation. The Gaussian approximation of the histograms has

Figure 4.13: Log-normal vs Gaussian approximation to the histogram (a). Scaled functions to illustrate the patterns of different parametric approximations (b). [Panel (a): probability versus |RSSI| for the RSSI histogram, the Gaussian fit and the log-normal fit. Panel (b): f(x) versus x for the Gaussian, inverse and exponential functions and the histogram.]

Figure 4.14: Different parametric approximations. [ME / m for the mean and map estimators with the log-normal, Gaussian, Gaussian (secavg), exponential and inverse pdfs of the measurement noise.]

been widely discussed in the literature [Kaemarungsi 2006], [Li et al. 2006], [Haeberlen et al. 2004] and it is also tested in this work. The Gaussian approximation can provide a good fit to some of the histograms, but for skewed histograms the fit can be bad. Moreover, some of the histograms are quite concentrated around their mean, whereas the Gaussian approximation tends to spread the distribution too much (Figure 4.13(b)). In those cases the exponential function provides a better approximation and improves the resolution between fingerprints that are close to each other.

The likelihood was computed by using different parametric approximations and Figure 4.14 shows that the exponential approximation produced the smallest mean error. The label secavg stands for the 1 s average; in that test the mean of the samples received during each 1 s period of the test data was used as the measurement, as discussed in Section 2.5.3.

In this work, the division of the variance by the number of samples stated in Theorem 2.10 was not done in the secavg approach, because it made the variance too small. Thus the variance was overestimated. It is interesting that the difference in the mean error between the 1 s average and the every-sample approach is minor. Computationally, using 1 s averages is faster than using each sample individually.
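The effect of dividing the variance by the number of samples can be seen in a small sketch of the Gaussian secavg likelihood. The function names, the flag `divide_by_n` and the numerical values are hypothetical and serve only to illustrate the trade-off.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def secavg_likelihood(samples, fp_mean, fp_var, divide_by_n=False):
    """Gaussian likelihood of the 1 s average of the samples from one AP.

    divide_by_n=True uses the variance of a sample mean, fp_var / n;
    divide_by_n=False keeps the single-sample variance, as was done in this work.
    """
    avg = sum(samples) / len(samples)
    var = fp_var / len(samples) if divide_by_n else fp_var
    return gaussian_pdf(avg, fp_mean, var)


kept = secavg_likelihood([-62, -64], fp_mean=-60.0, fp_var=4.0)
divided = secavg_likelihood([-62, -64], fp_mean=-60.0, fp_var=4.0, divide_by_n=True)
```

With the divided variance the density becomes narrow, so a 1 s average that is a few units away from the fingerprint mean receives a much smaller likelihood than with the undivided variance.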

Weighted K-nearest neighbor

In the weighted K-nearest neighbor approach, the 1 s averages were used as the measurement and different weight functions were tested. Li et al. [2006] suggest using the inverse of the distance used in the KNN method to compute the weights. To test that, 1 s averages were used as test data and the distance in the signal space was computed with the 1-norm used in the KNN method. From the probabilistic point of view, the computed similarity measure was used to infer the likelihood for the cell $B_i$, as discussed in Section 2.6.

Another tested method was to compute the likelihood from the inverse of the Mahalanobis distance (Definition 2.3). The inverse of the 1-norm and the inverse of the Mahalanobis distance can provide a fairly good map estimate; the mean errors were 5.6 m and 6.1 m, respectively. However, the mean estimator of the distribution does not give such good results; the mean errors were 38.0 m and 25.2 m, respectively. The map estimate actually equals the NN estimate with the 1-norm and the Mahalanobis distance as the distance measure. Because the 1-norm still beats the Mahalanobis distance, it is used in the following tests as the distance measure. The inverse function does not approach zero fast enough in the tails, which gives too large likelihood values to measurements that differ a lot from the fingerprint. Thus the $x_{\mathrm{MEAN}}$ estimator yields a large ME.
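The heavy-tail effect of the inverse-distance weights can be illustrated with a toy example. The sketch assumes a uniform prior over the cells, so that the normalized inverse distances act as posterior probabilities; the function names, the guard against zero distance and the numbers are hypothetical.

```python
def inverse_distance_posterior(distances):
    """Normalize inverse signal-space distances into cell probabilities."""
    weights = [1.0 / max(d, 1e-9) for d in distances]  # guard against zero distance
    total = sum(weights)
    return [w / total for w in weights]


def mean_and_map(points, probs):
    """Posterior mean and the map (maximum probability) estimate over the CPs."""
    mean = (sum(p * x for p, (x, _) in zip(probs, points)),
            sum(p * y for p, (_, y) in zip(probs, points)))
    best = max(range(len(points)), key=lambda i: probs[i])
    return mean, points[best]


points = [(0.0, 0.0), (50.0, 0.0), (100.0, 0.0)]
probs = inverse_distance_posterior([2.0, 20.0, 20.0])
mean, map_estimate = mean_and_map(points, probs)
```

Although the first CP matches ten times better than the others, the distant CPs still carry a sixth of the probability mass and pull the mean 12.5 m away, while the map estimate stays at the best-matching CP.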


4.4.2 Radio map density

The density of the radio map was varied by changing the number of CPs considered during the location estimation phase; the radio maps sparse, $60r_1$ and $60r_2$ were used. All radio maps have fingerprints collected by rotating at the CPs, so that the density is the only difference between the radio maps. The posterior mean was used as the estimator. As expected, the denser the radio map, the better the results in every tested

Figure 4.15: Effect of radio map density. [ME / m of the histogram comparison, histogram, NN, parametric and kernel methods with the radio maps sparse, $60r_1$ and $60r_2$.]

method, as can be seen from Figure 4.15. However, the improvement from the radio map sparse to $60r_1$ is larger than the improvement from $60r_1$ to $60r_2$. Thus the results do not improve linearly as a function of the radio map density. The histogram and histogram comparison methods benefit the most from the refinement of the radio map.

4.4.3 Single orientation vs varying orientation

In this section it is tested whether the orientation of the user during the calibration phase affects the location estimation phase. The fingerprint data was collected either by holding the MU in one orientation or by rotating it with the user as the user turned around his axis. The purpose of the rotation was to even out the impact of the MU’s orientation and thus to measure a more reliable fingerprint compared to



Figure 4.16: Effect of orientation. [ME / m of each method with the radio maps collected still and rotating.]

the fingerprint which is measured in only one direction. To test the impact of the orientation, the density and the calibration time of the radio maps were held constant, so that the rotation was the only variable. The first radio map used was $60r_1$ with the calibration time truncated to 30 s, and the other radio map was $30_1$. The posterior mean was used as the estimator.

Figure 4.16 shows that all tested methods benefit significantly from the rotation of the MU during the calibration phase. The histogram comparison and histogram methods in particular gain a lot from the rotation. With the rotation, the kernel and the parametric methods obtain a smaller ME than the NN method, whereas the NN method has the smallest ME of all methods when the rotation is not present.

4.4.4 Calibration time

In this section the effect of varying the calibration time at each of the CPs is examined. The posterior mean was used as the estimator. Figure 4.17 shows the ME of the different methods as a function of the calibration time. The histogram method needs a lot of data to give a reliable likelihood, and prolonging the calibration time is beneficial up to about 30 s. However, the histogram method reaches its minimum ME with the maximum 60 s calibration time. The histogram comparison method reaches its best performance already at about 10 s.

The parametric, NN and kernel methods perform well even with a short calibration time. The largest drop in the ME occurs between 1 s and 2 s. Up to 10 s the best method varies among these three methods. When more data is collected, the kernel method shows the best performance. This can be seen already after 10 s. The parametric, NN and kernel methods perform well, because they fill in the incomplete fingerprint data in different ways.

Figure 4.17: Effect of calibration time. [ME / m versus calibration time / s (from 1 s to 60 s) for the histogram, kernel, parametric, histogram comparison and NN methods.]

4.4.5 Number of access points in the test data

As discussed in Section 4.3, a set of 20 APs was used in this work. In this section the effect of varying the number of APs used in the test data is examined. This situation can arise, for example, when an AP fails after the calibration phase. In the test bed used, none of the 20 APs covers the whole test area. Thus, to test the effect of the number of available APs, the strongest APs were used in the test data. The posterior mean was used as the estimator.

Figure 4.18 illustrates that all tested methods benefit from increasing the number of available APs. However, the ME is not a linear function of the number of available APs


Figure 4.18: Effect of number of APs in the test data. [ME / m versus the number of APs (from 1 to the maximum) for the histogram, kernel, parametric, histogram comparison and NN methods.]

and after five APs there is no dramatic change in the performance. The order of the methods according to the ME stays almost the same for all numbers of APs and the kernel method has the smallest ME throughout the test. The other methods outperform the NN method when only 1 AP is available.

4.4.6 Test data from different access points than calibration data

In this work a certain number of samples is used to infer the location estimate. These samples are from different APs. At the time step k only the fingerprints in the set

$$R^S_k = \{ R_i \mid N_y \subseteq N_i \} \qquad (4.1)$$

are considered in the location estimation phase. Equation (4.1) means that the fingerprint has to include measurements from the same APs as the test data; otherwise the likelihood in that grid cell is zero in the probabilistic methods, or the CP is ignored in the deterministic methods. Could the samples in the fingerprint set

$$R^N_k = \{ R_i \mid R_i \notin R^S_k \}$$

be exploited somehow? In this section this question is examined.
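The split of the radio map into the usable and the remaining fingerprints can be sketched with set operations. The representation of a fingerprint by the set of APs it contains, and the function name, are assumptions made for this illustration.

```python
def split_fingerprints(heard_aps, fingerprints):
    """Split the CPs according to Equation (4.1).

    heard_aps    -- APs present in the test data at time step k (N_y)
    fingerprints -- {cp_id: set of APs in the fingerprint} (N_i for each R_i)
    Returns (usable, rest): CPs whose fingerprint covers all heard APs, and the others.
    """
    heard = set(heard_aps)
    usable = [cp for cp, aps in fingerprints.items() if heard <= set(aps)]
    rest = [cp for cp in fingerprints if cp not in usable]
    return usable, rest


fingerprints = {1: {"a", "b", "c"}, 2: {"a"}, 3: {"b", "c"}}
usable, rest = split_fingerprints({"a", "b"}, fingerprints)
```

If the usable list is empty at some time step, no static estimate can be produced without one of the remedies examined in this section.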


Several approaches to solve this problem were tested. In the first approach only the measurements from the common APs were used, but it did not produce good results. The second attempt was to set a penalty on the fingerprints in $R^N_k$. Finding the correct penalty was hard and no way to improve the performance was found with this approach either.

Filling in the missing samples of the fingerprints in the set $R^N_k$ was tested next. At first a small constant was set as the missing value. The constant was chosen to be -96, that is, 1 unit smaller than the weakest sample actually received. The other approach was to exploit the path-loss model to generate the measurements from the missing APs. The path-loss model is

$$P = P_0 - 10\gamma \log(d),$$

where $P_0$ is the received power at 1 m distance from the transmitter, the parameter $\gamma$ is the path-loss exponent and d is the distance between the transmitter and the receiver. The path-loss exponent was set to $\gamma = 3$ and the constant to $P_0 = -30$. The distance d is known; it is the distance between the fingerprint and the AP. Figure 4.19 shows that


Figure 4.19: Effect of the generation of the fingerprints. [ME / m of each method without added fingerprints, with fingerprints filled with the constant and with fingerprints filled by the path-loss model.]

both the constant value and the path-loss model approaches improved the estimates slightly, the constant value approach a little more. At some time steps the static location estimation algorithms were not able to produce a location estimate at all, because no fingerprint in $R^S_k$ was found. Thus filling in the missing values was very important for inferring the location estimate at these time steps.
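The two ways of filling in the missing values can be sketched as follows, using the parameter values of this section ($P_0 = -30$, $\gamma = 3$, constant -96). The base-10 logarithm and the clamping of distances below 1 m are assumptions of this sketch, and the function names and data layout are hypothetical.

```python
import math


def path_loss_rssi(distance_m, p0=-30.0, gamma=3.0):
    """Path-loss model P = P0 - 10 * gamma * log10(d); base 10 is assumed here."""
    # Distances below 1 m are clamped so that the result never exceeds P0 (assumption).
    return p0 - 10.0 * gamma * math.log10(max(distance_m, 1.0))


def fill_missing(fingerprint, ap_positions, cp_position,
                 use_path_loss=True, constant=-96.0):
    """Return a copy of the fingerprint with values generated for the missing APs.

    fingerprint  -- {ap_id: mean_rssi} measured at the CP
    ap_positions -- {ap_id: (x, y)} for all APs of the radio map
    """
    filled = dict(fingerprint)
    for ap, pos in ap_positions.items():
        if ap not in filled:
            if use_path_loss:
                filled[ap] = path_loss_rssi(math.dist(cp_position, pos))
            else:
                filled[ap] = constant
    return filled


aps = {"a": (0.0, 0.0), "b": (10.0, 0.0)}
by_model = fill_missing({"a": -50.0}, aps, cp_position=(0.0, 0.0))
by_constant = fill_missing({"a": -50.0}, aps, cp_position=(0.0, 0.0), use_path_loss=False)
```

With these parameters an AP 10 m away is assigned -60, which ignores walls and floors; this may be one reason why the simple constant performed at least as well in the tests.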


4.4.7 Summary of static location estimation algorithms

The impact of varying the different parameters was examined in the previous sections. In this section the algorithms are compared against each other by using the best parameter values found in this work. The numerical comparison is shown in Tables 4.1 and 4.2 for the radio maps $60r_2$ and $60r_1$, respectively.

As already discussed, the refinement of the radio map improved the performance, which can also be seen from the numerical results. The probabilistic histogram, kernel and parametric methods performed slightly better than the deterministic NN method.

The histogram comparison method has the largest ME of the tested methods with both radio maps, and the kernel method has the smallest ME by a small margin. The mean is a better estimator than the map in all probabilistic methods, except for the histogram comparison method, which has a smaller ME with the map estimator.

With the full set of 20 APs available, the three probabilistic methods do not differ significantly from each other. However, as discussed, the kernel and parametric methods can obtain significantly better results than the histogram method with a reduced calibration time.

Table 4.1: Summary of the performance of the static estimators with the radio map $60r_2$ (errors in m)

Method                      ME  Median    RMSE     Max    25th    75th    95th
Histogram, mean            5.5     4.3     8.4   104.5     2.1     7.2    14.2
Histogram, map             5.7     4.3     8.4   106.1     2.0     7.4    14.5
Kernel, mean               5.4     4.1     8.9    98.6     2.0     6.9    12.3
Kernel, map                5.7     4.3     8.9   119.6     2.0     7.0    12.9
Parametric, mean           5.5     4.3     8.7   106.3     2.1     7.0    13.1
Parametric, map            5.6     4.3     8.7   119.6     2.1     7.2    13.6
Histogram comp, mean       7.0     5.0    11.0   106.0     2.6     8.6    19.4
Histogram comp, map        6.6     4.5    11.0   119.6     2.2     8.2    18.9
NN                         5.6     4.4     8.8   119.6     2.1     7.1    13.7


Table 4.2: Summary of the performance of the static estimators with the radio map $60r_1$ (errors in m)

Method                      ME  Median    RMSE     Max    25th    75th    95th
Histogram, mean            8.0     6.0    12.0   143.2     3.1    10.2    19.4
Histogram, map             8.2     6.1    12.0   143.2     3.1    10.4    19.8
Kernel, mean               6.9     5.7     9.3    79.6     2.8     9.4    16.3
Kernel, map                7.0     5.8     9.3   119.6     2.9     9.4    16.3
Parametric, mean           7.2     5.7    10.4   106.2     2.8     9.4    16.8
Parametric, map            7.3     5.8    10.4   119.6     2.9     9.4    17.3
NN                         7.2     5.8     5.8   122.1     3.0     9.5    16.6

4.5 Filtering algorithms

In Section 4.4 the static estimation algorithms were examined. In this section the objective is to compute the location estimates as a time series, which leads to the filtering approach. In Section 4.5.1 different state models are applied to the static location estimation algorithms. In Section 4.5.2 the generalized form of the BLU estimator is used to test the non-linear Kalman filter algorithm.

4.5.1 Applying state models to the static algorithms

In this section different state models are applied to the static location estimation

algorithms to improve and smooth the estimates of the static location estimation

algorithms.

The graph model discussed in Section 3.2 was tested first to compute the piece-wise

constant prior distribution from the posterior distribution of the previous time step.

The graph included the connections between the grid cells.

The stationary and constant velocity state models discussed in Section 3.2 were used in the linear Kalman filter approach. The mean estimate from the static location estimation algorithm was used as the measurement for the Kalman filter. Different methods to compute the static estimate were tested. However, the covariance matrix of the estimate from the static location estimation algorithm was often small, which prevented the Kalman filter from smoothing the estimates. Thus the measurement error covariance matrix $R_k$ of the Kalman filter was increased, so that the posterior covariance matrix also increased. The matrix

$$A = \begin{bmatrix} 4 & 0 \\ 0 & 4 \end{bmatrix}$$

was added to the covariance matrix of the static location estimation algorithm to obtain the matrix $R_k$. The time step in the tests was 1 s, thus the state transition matrix for the stationary state model was

$$F_S = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

and the state noise covariance matrix was

$$Q_S = \begin{bmatrix} 8.3 & 0 \\ 0 & 8.3 \end{bmatrix}.$$

In the constant velocity model the 2-dimensional velocity was also included in the state and the state transition matrix was

$$F_{CV} = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$

The state noise covariance for the constant velocity model is given in Eq. (3.4), where $\sigma_C^2 = 2$ was used as the parameter.
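One step of the Kalman filter with the stationary state model can be sketched with plain 2 x 2 matrix arithmetic. The sketch assumes that the measurement matrix is the identity, because the static position estimate measures the state directly; the helper functions and the numerical example are hypothetical, whereas the matrices $Q_S$ and A are the ones given above.

```python
def mat_add(a, b):
    """Sum of two 2 x 2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def mat_mul(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_inv(m):
    """Inverse of a non-singular 2 x 2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


Q_S = [[8.3, 0.0], [0.0, 8.3]]  # state noise covariance of the stationary model
A = [[4.0, 0.0], [0.0, 4.0]]    # inflation added to the static covariance


def kf_stationary_step(x, P, z, static_cov):
    """One predict and update step; z is the static mean estimate used as measurement."""
    # Prediction: F_S is the identity, so the mean stays and the covariance grows by Q_S.
    P_pred = mat_add(P, Q_S)
    R = mat_add(static_cov, A)  # R_k = static covariance + A
    # Update with an identity measurement matrix (assumption of this sketch).
    K = mat_mul(P_pred, mat_inv(mat_add(P_pred, R)))
    innov = [z[0] - x[0], z[1] - x[1]]
    x_new = [x[0] + K[0][0] * innov[0] + K[0][1] * innov[1],
             x[1] + K[1][0] * innov[0] + K[1][1] * innov[1]]
    I_minus_K = [[1.0 - K[0][0], -K[0][1]], [-K[1][0], 1.0 - K[1][1]]]
    return x_new, mat_mul(I_minus_K, P_pred)


x_new, P_new = kf_stationary_step([0.0, 0.0], [[1.7, 0.0], [0.0, 1.7]],
                                  [10.0, 0.0], [[6.0, 0.0], [0.0, 6.0]])
```

In the example the predicted covariance and $R_k$ are equal, so the gain is one half and the filter moves halfway toward the static estimate. Without the added matrix A the gain would be larger and the filter would follow the static estimates more closely, which is the lack of smoothing described above.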

Table 4.3: Applying graph state model to the static algorithms with the radio map $60r_2$ (errors in m)

Method                      ME  Median    RMSE     Max    25th    75th    95th
Histogram, mean            4.9     4.2     6.6    43.2     1.9     6.8    12.2
Histogram, map             5.0     4.2     6.6    43.2     2.0     6.9    12.2
Kernel, mean               4.6     3.9     5.7    19.9     1.9     6.4    11.0
Kernel, map                4.7     4.1     5.7    20.0     1.9     6.6    11.3
Parametric, mean           4.8     4.2     6.0    24.9     2.0     6.7    11.4
Parametric, map            4.9     4.2     6.0    25.1     2.0     6.8    11.4
NN                         4.6     3.9     5.8    21.4     2.0     6.4    11.4

The effect of the filtering approach is shown in Figure 4.20, which shows that applying a state model to the static location estimation algorithm improves the results



4.5

7

Method

ME / m

Mean, staticMap, staticMean, graph state modelMap, graph state modelKF, stationaryKF, constant velocity

Figure 4.20: Effect of filtering approach

Table 4.4: Applying stationary state model to Kalman filter with the radio map 60r2


Histogram 4.8 4.0 6.2 39.2 1.9 6.7 11.7

Kernel 4.5 3.8 5.5 20.2 2.0 6.2 10.8

Parametric 4.9 4.0 6.8 87.7 1.9 6.7 11.2

Histogram comp 5.7 4.4 7.7 55.9 2.2 8.2 14.5

NN 5.1 3.9 7.8 90.6 1.9 6.5 12.0

clearly. Filtering approach smoothes the path of the estimates and reduces all reported

errors. Especially the reduction in the maximum error is significant in the filtering

approach. The numerical results can be seen in Tables 4.3, 4.4 and 4.5.

4.5.2 Non-linear Kalman filter

Non-linear Kalman filter or BLU estimator was tested with the stationary state model by modeling the state first as a discrete random variable and then as a continuous random variable. The state transition matrix was the same as in the Kalman filter approach, the matrix $F_S$, and the state noise covariance matrix was

\[
Q_S = \begin{bmatrix} 20 & 8 \\ 8 & 20 \end{bmatrix}.
\]

Table 4.5: Applying constant velocity model to Kalman filter with the radio map 60r2

Histogram           5.0   4.1   6.4   39.8   2.2   7.0   11.9
Kernel              4.7   4.0   5.8   22.4   2.1   6.4   11.1
Parametric          5.1   4.2   7.1   91.8   2.1   6.9   12.0
Histogram comp      6.2   4.7   8.6   76.7   2.4   8.4   16.1
NN                  5.2   4.1   7.8   82.0   2.0   6.8   12.6

Table 4.6: Applying graph state model with the radio map 60r1

Histogram, mean     6.7   5.8   8.2   30.8   2.8   8.5   14.3
Histogram, map      6.8   5.8   8.2   32.1   2.8   8.6   14.5
Kernel, mean        6.3   5.5   7.7   26.1   2.9   8.9   14.8
Kernel, map         6.4   5.6   7.7   26.3   3.0   9.1   15.0
Parametric, mean    6.5   5.7   8.0   26.3   3.0   9.0   14.9
Parametric, map     6.6   5.8   8.0   26.3   3.0   9.1   15.2
NN                  6.4   5.6   7.8   30.2   2.9   9.0   14.9

Problems with the numerical computations arose during the tests. The corridors produced problems, because there the CPs lie on the same line and thus the MSE matrix of the state was easily close to singular. The other issue was the weights $\alpha_i$ in equation (3.17), which defines the prior state. If the MSE matrix was small at some time instant $t_{k-1}$ and the estimate $x_{k-1}$ was far from the true location at the time instant $t_k$, the weights tended to be too close to zero near the true location. Then the BLU estimator could not produce an estimate, because there were not enough fingerprints available. Thus the state noise covariance matrix had to be large enough to guarantee that the weights would not all be equal to zero, even if the estimate jumped far from the true location.
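The singularity problem in corridors can be illustrated with a small numerical sketch. It does not reproduce the BLU estimator equations of Chapter 3; it only computes a weighted sample covariance of CP positions, with hypothetical coordinates and weights, and shows that the determinant vanishes when all the points lie on one line.

```python
def weighted_cov(points, weights):
    """Weighted 2x2 sample covariance of 2-dimensional points."""
    total = sum(weights)
    mx = sum(w * p[0] for p, w in zip(points, weights)) / total
    my = sum(w * p[1] for p, w in zip(points, weights)) / total
    sxx = sum(w * (p[0] - mx) ** 2 for p, w in zip(points, weights)) / total
    syy = sum(w * (p[1] - my) ** 2 for p, w in zip(points, weights)) / total
    sxy = sum(w * (p[0] - mx) * (p[1] - my) for p, w in zip(points, weights)) / total
    return [[sxx, sxy], [sxy, syy]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

corridor = [(0.0, 5.0), (2.0, 5.0), (4.0, 5.0), (6.0, 5.0)]   # hypothetical CPs on one line
room = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]       # hypothetical CPs spread in 2-D
w = [0.25, 0.25, 0.25, 0.25]                                  # hypothetical weights

det_corridor = det2(weighted_cov(corridor, w))   # zero: the matrix cannot be inverted
det_room = det2(weighted_cov(room, w))           # clearly positive
```

In practice the corridor points are rarely exactly collinear, so the determinant is small rather than zero, and the inversion becomes numerically unstable instead of failing outright.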


Table 4.7: Applying stationary state model to Kalman filter with the radio map 60r1

Histogram           6.9   5.7    9.6   104.5   2.7    9.5   16.6
Kernel              6.2   5.2    7.7    40.4   2.7    8.8   14.5
Parametric          6.4   5.4    8.2    51.4   2.7    8.9   15.0
Histogram comp      7.4   5.9   10.3    80.5   2.7   10.3   18.2
NN                  6.6   5.6    9.3    93.4   2.8    9.0   15.1

Table 4.8: Applying constant velocity model to Kalman filter with the radio map 60r1

Histogram           7.4   5.8   10.3   100.0   3.1    9.8   17.7
Kernel              6.5   5.6    8.2    43.9   2.8    9.1   15.4
Parametric          6.7   5.6    8.6    51.4   2.9    9.4   15.8
Histogram comp      8.2   5.9   12.5   145.3   3.1   11.1   20.9
NN                  6.8   5.6    9.6    98.8   3.0    9.1   15.6

The BLU estimator described in this work needs a lot of fingerprint data around the true location to solve the state properly. However, a normal office environment includes many corridors, rooms and forbidden areas. Thus the true location is often surrounded by areas which do not contain fingerprint data. Only the fingerprints in RSk were considered at time instant $t_k$, which also caused problems. Figure 4.3 illustrates how the APs can be heard even at the other side of the building; the signal can propagate for example through windows and stairways. Then the set of CPs corresponding to the set RSk can be widely spread and it can include some CPs which are far from the true location of the MU at the time step $t_k$. This implies a sparse fingerprint database and causes problems when computing the unknown quantities of the BLU estimator.

In Tables 4.9 and 4.10 the subscript 1 denotes the discrete and the subscript 2 the continuous modeling of the state. Tables 4.9 and 4.10 show that modeling the state as a discrete random variable produced better results with both radio maps. The good performance of BLUE1 is seen especially with the radio map 60r1, where the results were better than with the other filters.

Table 4.9: Non-linear Kalman filter with the radio map 60r2

BLUE1     5.5    4.0    7.8    46.6   2.3    7.1   14.3
BLUE2    15.9    8.3   28.3   125.0   4.3   15.3   53.5

Table 4.10: Non-linear Kalman filter with the radio map 60r1

BLUE1     5.8    5.2    6.9    23.9   3.1    7.8   13.0
BLUE2    21.8   11.5   38.2   248.0   7.2   22.3   47.8

BLUE1 with the radio map 60r2 was examined with a varying number of APs. Figure 4.21 illustrates how the ME diminished rapidly when the number of APs was increased. The figure also compares BLUE1 against the static location estimation algorithm with the kernel approximation. BLUE1 was more sensitive to the variation of the number of APs, but it performed better than the static algorithm for several numbers of APs. The smallest ME, 4.9 m, was achieved with 7 APs.

[Figure 4.21: BLUE, number of APs varied. Horizontal axis: Number of APs; vertical axis: ME / m. Curves: BLUE1; Kernel, static.]

Chapter 5

Conclusions and Future Work

In this thesis different location fingerprinting methods were compared by introducing the mathematical basis of the methods and by running a wide range of tests with real data. The mathematical formulation was done from different points of view and the parameters of the methods were varied in the tests to find the best performance. The environmental variables, such as the number of access points (APs) and the radio map density, were also varied, and the methods were compared in these varying circumstances as well.

The main goal in the mathematical formulation was to model the location as a continuous random variable, which leads to the division of the area of interest into rectangular cells instead of considering only the individual calibration points. However, deterministic methods, such as the K-nearest neighbor (KNN) method and the weighted K-nearest neighbor (WKNN) method, were formulated with a discrete location variable, as they are presented in the literature.

The methods studied can be divided into two groups, namely the static location estimation approach and the filtering approach. In the static approach the main idea is to compute the likelihood function at the cells, although the Bayesian approach was also introduced with a uniform prior. The likelihood was computed by various methods, most of them based on the measurement model and the computation of the likelihood with a known distribution of the measurement noise. The distribution of the measurement noise was modeled to obey the Received Signal Strength Indicator (RSSI) patterns or histograms measured at the calibration points. Different methods to compute the likelihood function yield different approximations to the RSSI histograms. The kernel approximation of the RSSI histograms with the exponential kernel function performed best in most of the tests; it filled the gaps in the histograms and smoothed them.
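The gap-filling effect of a kernel approximation can be sketched as follows. The kernel form exp(-|u| / width) / (2 width), the parameter `width` and the RSSI samples are assumptions made for this illustration; the exponential kernel actually used in this work is the one defined in the description of the methods and may differ in form and scaling.

```python
import math

def kernel_likelihood(rssi, samples, width=1.0):
    """Kernel estimate of the density of an RSSI value from the samples of one AP at one CP.

    Uses the assumed kernel exp(-|u| / width) / (2 * width), which integrates to one.
    """
    total = sum(math.exp(-abs(rssi - s) / width) for s in samples)
    return total / (2.0 * width * len(samples))

# Hypothetical RSSI samples (dBm); the raw histogram has no samples at -68 or -67.
samples = [-70, -70, -69, -66, -66, -65]
p_gap = kernel_likelihood(-68, samples)    # positive although the histogram bin is empty
p_peak = kernel_likelihood(-70, samples)   # larger, because two samples sit exactly here
```

A raw histogram would assign zero likelihood to a measurement of -68 dBm and thereby rule out the cell completely, whereas the kernel estimate only makes it less likely.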

Some methods, such as the histogram comparison method, are not based on the measurement model but on the similarity of the fingerprint histograms and the test data histograms. In these tests, the Simandl and the Bhattacharyya distance measures gave the smallest mean error.
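As an illustration of histogram comparison, the sketch below computes the Bhattacharyya distance in its common textbook form, -ln(sum_i sqrt(p_i q_i)), between two normalized histograms defined over the same bins. This form is an assumption; the variant used in the tests may be defined differently, and the histogram values are hypothetical.

```python
import math

def bhattacharyya_distance(p, q):
    """Distance -ln(BC), where BC = sum_i sqrt(p_i * q_i), for normalized histograms."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.inf if bc == 0.0 else -math.log(bc)

fingerprint = [0.1, 0.6, 0.3]    # hypothetical relative frequencies at a CP
similar = [0.2, 0.5, 0.3]        # hypothetical test data histogram, similar shape
different = [0.8, 0.1, 0.1]      # hypothetical test data histogram, different shape

d_similar = bhattacharyya_distance(fingerprint, similar)
d_different = bhattacharyya_distance(fingerprint, different)
```

The CP whose fingerprint histogram gives the smallest distance to the test data histogram is then the most similar one; histograms without any common non-empty bin are at infinite distance.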

The filtering approach can be divided into three parts. The graph state model exploits the known connections between the cells according to the floor plan. Applying the graph state model to the static location estimation algorithms improved the results significantly. The linear Kalman filter (KF) is the counterpart of the graph state model and it is also based on the static algorithms; in the KF approach the location estimate from the static estimator is given as a linear measurement to the Kalman filter. The KF was tested with two state models, the stationary and the constant velocity state models. Both of these state models provided better results than the static estimators alone, but the stationary state model outperformed the constant velocity model. The KF with the stationary state model performed even better than the graph state model when the histogram and kernel methods were used as the static estimation algorithms.

The third approach to the filtering was a non-linear Kalman filter, where the unknown quantities of the BLU estimator are computed exactly from the fingerprints. The non-linear Kalman filter was studied with both continuous and discrete modeling of the state, and the two approaches were compared. The discrete modeling of the state gave better results than the continuous modeling. The non-linear Kalman filter provided better performance than the static estimation algorithms, especially with a small number of APs and a sparse radio map.

The radio map is worth planning well. The cells should be chosen to match the floor plan: a wall should not run through the middle of a cell, because then the fingerprint measured at the center of the cell hardly represents the RSSI distribution in the whole cell. Proper guidelines for an adequate density of the grid in the radio map are hard to give, because the density depends on the floor plan and on the locations and number of available APs.


In the future several additional tests can be made. The exploitation of the channel information did not provide good results in this work; a lot of data is needed to provide a reliable radio map from different channels. In this work the calibration points were chosen to be in the centers of the cells. More tests can be made in which the data is collected from all over the cell, which may give a more reliable signal characterization inside the cells.

Mobile devices develop all the time, and more usable measurements from different sources become available. The combination of different types of measurements to compute the location estimate, called hybrid positioning, is therefore also worth testing and developing in the future. The mathematical formulation of the cells and the computation of the piecewise constant posterior enable easier combination with the other measurements; when multiple measurements are available, the posterior obtained from the location fingerprinting algorithm can be interpreted as the measurement likelihood for the hybrid system.

In this work the calibration and the location estimation measurements were collected with the same measurement unit. In the future more tests are needed with a variety of measurement devices.

Bibliography

Paramvir Bahl and Venkata N. Padmanabhan. Radar: An in-building rf-based user

location and tracking system. INFOCOM 2000. Nineteenth Annual Joint Conference

of the IEEE Computer and Communications Societies. Proceedings. IEEE, 2(10):

775–784, March 2000.

Y. Bar-Shalom and R. X. Li. Estimation with Applications to Tracking and Navigation: Theory, Algorithms and Software. John Wiley & Sons, Inc., 2001.

Richard O. Duda, Peter E. Hart, and David G. Stork. Pattern Classification. John Wiley & Sons, Inc., 2001.

Ekahau. URL http://www.ekahau.com/.

Andreas Haeberlen, Eliot Flannery, Andrew M. Ladd, Algis Rudys, Dan S. Wallach,

and Lydia E. Kavraki. Practical robust localization over large-scale 802.11 wireless

networks. Technical report, MobiCom’04, 2004.

J. Hightower and G. Borriello. Location systems for ubiquitous computing. IEEE

Computer, 1(34):57–66, August 2001.

Andrew H. Jazwinski. Stochastic Processes and Filtering Theory, volume 64. Academic

Press, 1970.

Simon J. Julier and Jeffrey K. Uhlmann. Unscented filtering and nonlinear estimation.

Proceedings of the IEEE, 92(3), March 2004.

K. Kaemarungsi and P. Krishnamurthy. Modeling of indoor positioning systems based on location fingerprinting. IEEE Computer, 7, 2004.

Kamol Kaemarungsi. Distribution of wlan received signal strength indication for in-

door location determination. Technical report, National Electronics and Computer

Technology Center, Thailand, 2006.


Osmo Kaleva. Matemaattinen tilastotiede. Tampere university of technology, lecture

notes, 2008.

Petri Kontkanen, Petri Myllymaki, Teemu Roos, Henry Tirri, Kimmo Valtonen, and Hannes Wettig. Topics in probabilistic location estimation in wireless networks. Technical report, Complex Systems Computation Group, Helsinki Institute for Information Technology, University of Helsinki, Helsinki University of Technology, 2004.

Antti Kotanen, Marko Hannikainen, Helena Leppakoski, and Timo D. Hamalainen.

Positioning with ieee 802.11b wireless lan. The 14th IEEE International Symposium

on Personal, Indoor and Mobile Radio Communication Proceedings, 2003.

A. Kushki, K.N. Plataniotis, and A.N. Venetsanopoulos. Radio map fusion for indoor

positioning in wireless local area networks. The 7th International Conference on

Information Fusion, pages 1311–1318, 2005.

Andrew M. Ladd, Kostas E. Bekris, Algis Rudys, Guillaume Marceau, Lydia E. Kavraki, and Dan S. Wallach. Robotics-based location sensing using wireless ethernet. MOBICOM’02, 2002.

Binghao Li, James Salter, Andrew G. Dempster, and Chris Rizos. Indoor positioning techniques based on wireless lan. Technical report, School of Surveying and Spatial Information Systems, UNSW, Sydney, Australia, 2006.

Anvar Narzullaev, Youngwan Park, and Hoyoul Jung. Accurate signal strength prediction based positioning for indoor wlan systems. Information and Communication Engineering Department, Yeungnam University, 2008.

Armo Pohjavirta and Keijo Ruohonen. Laaja tilastomatematiikka. Tampere university

of technology, lecture notes, 2005.

P. Prasithsangaree, P. Krishnamurthy, and P.K. Chrysanthis. On indoor position loca-

tion with wireless lans. Technical report, Telecommunications Program, University

of Pittsburgh PA 15260, 2002.

Teemu Roos, Petri Myllymaki, Henry Tirri, Pauli Misikangas, and Juha Sievanen. A

probabilistic approach to wlan user location estimation. International Journal of

Wireless Information Networks, 9(3):155–163, July 2002.

Siddharta Saha, Kamalika Chaudhuri, Dheeraj Sanghi, and Pravin Bhagwat. Location determination of a mobile device using ieee 802.11b access point signals. Technical report, Department of Computer Science and Engineering, 2003.


Niilo Sirola. Mathematical Methods for Personal Positioning and Navigation. PhD

thesis, Tampere University of Technology, 2007.

Martin Vossiek, Leif Wiebking, Peter Gulden, Jan Wieghart, Clemens Hoffman, and

Patric Heide. Wireless local positioning, pages 77–86. IEEE Microwave magazine,

2003.

Michael Wallbaum and Otto Spaniol. Indoor positioning using wireless local area

networks. Technical report, RWTH Aachen University, Department of Computer

Science, 2006.

Wireshark. URL www.wireshark.org/.

Appendix A

Structure of implementation

The Wireshark software [Wireshark] was used to collect the radio map and the test data. The Wireshark data can be exported in either the comma separated values (csv) or the packet details markup language (pdml) file format. The advantage of the csv file format is the small file size, just about 400 KB for a 1 minute calibration time. The disadvantage is that it does not preserve as much data as the pdml format, which also stores the channel information of the received sample. Both of these file formats were used in the tests. The data from Wireshark was imported to Matlab 7.4.0, where the processing of the data continued.

In this section the main data structures and the Matlab functions used in this work

are described. The different options are implemented in the functions and they are

designed to provide a platform for more tests in the future.

A.1 Data structures

The radio map and the test data have their own structure. In location fingerprinting the main computational task is to compare the measurements to the fingerprints. Thus it is rational to use as similar data structures as possible for the radio map and for the test data. The Radio map structure is explained in Table A.1.

[Figure A.1: The structure of implementation. The RSSI measurements collected with Wireshark are stored in a text file (csv or pdml format), imported to Matlab (mat format) with the functions doMatfiles and doCPs, parsed into the Radio map and Meas Matlab structures with the functions parsefp, parsetrack and radiomapAdd, and given to the location estimation solvers fpsolver, kfsolver and bluesolver.]

Table A.1: Radio map structure. The superscript ∗ denotes that the information is not necessarily available.

Field        Explanation
Name         Label of CP in radio map
Position     2-dimensional coordinates of CP
Vertices     Vertices of cell
Area         Area of cell
Values       Mean or median of samples from different APs
Variances    Sample variances of samples from different APs
APs          Indexes of APs heard
Period       Duration of measurement period
Time         Time of day
Logmean      Mean of logarithms of samples
Logvar       Variance of logarithms of samples
Channels∗    Histogram of channels
Histograms   Raw data from different APs stored as histograms

The measurements from different APs are separated, because the measurements from an AP can be compared only to the fingerprints from the same AP. The radio map includes measurements from either a 30 s or a 60 s time period. The location estimation was done using a sliding window of measurements, so the Meas structure (Table A.2) includes fields quite similar to those of the Radio map structure. The RSSI histograms are also collected into a structure, which is described in Table A.3. In addition to the options described in Table A.3, the Proportion field can be the proportion relative to the largest frequency. As can be seen from Tables A.1 and A.3, the storage of the data has several options, which can be set during the data parsing. The effects of most of these options are studied in Section 4.4. It is also possible to separate the received samples according to the channel. Then the described data structures include data only from a certain channel.

Table A.2: Meas structure

Field        Explanation
Values       Mean or median of samples from different APs
Variances    Sample variances of samples from different APs
APs          Indexes of APs heard
Rawdata      Individual samples
Time         Time of day
Channels∗    Histogram of channels
Histograms   Raw data from different APs stored as histograms

Table A.3: Histogram structure

Field        Explanation
AP           Index of access point
Values       Different received RSSI values
Proportion   Frequency or relative frequency
Sample size  Number of samples received
Bin width    Bin width of histogram
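The Histogram structure of Table A.3 could be mirrored, for example, as follows. The implementation in this work is in Matlab; the Python names below (`Histogram`, `build_histogram`, `relative`) and the binning rule (each sample is mapped to the lower edge of its bin) are assumptions made for this sketch, and the sample values are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Histogram:
    ap: int            # index of access point
    values: list       # different received (binned) RSSI values
    proportion: list   # frequency or relative frequency of each value
    sample_size: int   # number of samples received
    bin_width: int     # bin width of histogram

def build_histogram(ap, samples, bin_width=1, relative=True):
    """Collect the RSSI samples of one AP into a Histogram record."""
    binned = [bin_width * (s // bin_width) for s in samples]   # lower edge of each bin
    counts = Counter(binned)
    values = sorted(counts)
    freqs = [counts[v] for v in values]
    if relative:
        freqs = [f / len(samples) for f in freqs]
    return Histogram(ap, values, freqs, len(samples), bin_width)

h = build_histogram(ap=3, samples=[-70, -70, -69, -66], bin_width=2)
# h.values == [-70, -66], h.proportion == [0.75, 0.25]
```

Keeping the sample size next to the relative frequencies makes it possible to recover the absolute frequencies later, for example when a uniform component depending on the sample size is added to the bins.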

The data structure which includes the state variable is denoted as the Solv structure and it is described in Table A.4. The field Options includes solver-specific options, which are determined in the initialization of the solver.

Table A.4: Solv structure

Field        Explanation
State        Value of state x
P            Value of covariance matrix
Options      Solver-specific options


A.2 Importing data to Matlab

Matlab functions were implemented to parse the raw data into the mat file format, which makes it easier to reprocess the data with Matlab. The functions were implemented so that they work with both radio maps and with both file formats provided by Wireshark. These Matlab functions are described next.

A.2.1 True coordinates of calibration points and test data

The true locations of the CPs are imported to Matlab with the function doCPs, which

allows the user to click the locations of the CPs in the map. The function doCPs takes

the name of the floor plan image and the type of the radio map as input parameters

and gives the names and the coordinates of the CPs for the radio map as an output.

The assisting functions are used to draw the floor plan and to insert a new coordinate

into the structure.

The true coordinates of the track can be imported with the function truetrackGenerator, which allows the user to click the coordinates of the track. The number of time steps and the number of corners in the track are given as input parameters to the function.

A.2.2 Importing radio maps

As discussed, each fingerprint is stored in a text file, which has either the csv or the pdml file format. Matlab functions are used to parse the relevant data from these text files and to store it into a Matlab structure and further into a mat file. This is done with the function doMatfiles, which takes the positions of the CPs and the type of the radio map as input parameters. The function doMatfiles uses the function importData to save the mat files, which in turn uses the function parseFields to parse the data from the text file. The function parseFields selects the correct parser according to the output of the Wireshark software and returns the time, RSSI value and index of the AP of the data packets as the text output. If the text file is in the pdml file format, the parser also returns the channel information. The database of the chosen APs is also given as an input to the function parseFields to find the samples from the APs of interest, i.e. from the APs whose locations are known.

In addition to the RSSI information, the mat files produced by the function doMatfiles contain the coordinates of the CPs generated by the function doCPs.


A.2.3 Importing test data

The function doMatfiles can also be used to parse the test data into a mat file. In that case it takes the name of the data file as an input and gives the Meas structure as an output. The channel information is also parsed into the mat file if the raw data holds that information.

A.3 Parsing data for location estimators

The implemented location estimators, i.e. solvers, need essentially the radio map

and the test data to estimate the location of the MU. The Matlab functions are

implemented to parse the imported data into the structures, which can be given to

the location estimators.

A.3.1 Parsing radio map

The Radio map structure described in Section A.1 is created with the function parsefp. Table A.5 shows the options implemented for radio map parsing. The vertices of the grid cells are also needed in the radio map in addition to the CPs. The function radiomapAdd takes the Radio map structure as an input and gives the modified Radio map structure as an output. The modified Radio map structure includes the vertices and areas of the cells corresponding to the CPs. The function also gives the graph matrix, which holds the connection information between the cells.

Table A.5: Options in parsing the Radio map structure

Option               Explanation
Meas type            1 s averages or individual samples as measurements
Weak removal limit   Remove weaker samples than limit
Bin width            Histogram bin width, 1, 2, . . .
Type of uniform      Add inverse of sample size or sample variance or minimum frequency to all bins
Mean or median       Use mean or median to compute 1 s average
Histogram type       Frequency, proportion or proportion relative to maximum bin height


A.3.2 Parsing measurement structure

The test data is parsed using the function parseTrack, which converts the mat file into the Meas structure. The same options as for parsing the radio map can be used for parsing the test data. In addition, it is possible to choose the options described in Table A.6.

Table A.6: Options in parsing the Meas structure

Option                       Explanation
Time step                    Length of time step
Separate channels∗           Separates samples from different channels
Number of samples            Number of samples used to infer estimate
The set of APs considered    Only samples from certain APs considered
Number of strongest APs      Samples from n strongest APs considered
Minimum number of samples    Discard samples, if frequency is smaller than limit

A.4 Location solvers

All the location estimation algorithms were implemented in three different location solvers. The first solver is fpsolver, which includes the different methods to compute the likelihood in the radio map cells. It also includes the deterministic KNN and WKNN methods. The function fpsolver also has an option to compute the Bayesian posterior estimate, in which case the prior estimate is computed using the graph model described in Section 3.2. The solvers kfsolver and bluesolver include the linear Kalman filter and the non-linear Kalman filter algorithms, respectively.

After the parsing of the Radio map and Meas structures, the data can be given to the solvers. At first, the solver is initialized by giving the initial values of the fields in the Solv structure, and after that it is run recursively. At each time step the measurements from the test data, i.e. one entry from the Meas structure, are given as input to the solver to estimate the location of the MU.
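The recursive use of a solver can be sketched as below. The solvers of this work are Matlab functions; the Python names `init_solver`, `run_solver` and `average_step`, as well as the `Position` field of the measurement entries, are hypothetical. The placeholder step only moves the state halfway towards the measured position and stands in for the actual estimation algorithms.

```python
def init_solver(state, P, options=None):
    """Create a Solv-like record with the fields State, P and Options (Table A.4)."""
    return {"State": state, "P": P, "Options": options or {}}

def run_solver(solv, meas_entries, step):
    """Feed one measurement entry per time step to the solver and collect the estimates."""
    track = []
    for meas in meas_entries:
        solv = step(solv, meas)
        track.append(solv["State"])
    return track

def average_step(solv, meas):
    """Placeholder step: move the state halfway towards the measured position."""
    x, y = solv["State"]
    mx, my = meas["Position"]   # hypothetical field, not part of the Meas structure
    return {**solv, "State": ((x + mx) / 2.0, (y + my) / 2.0)}

solv = init_solver(state=(0.0, 0.0), P=[[1.0, 0.0], [0.0, 1.0]])
meas_entries = [{"Position": (2.0, 2.0)}, {"Position": (2.0, 2.0)}]
track = run_solver(solv, meas_entries, average_step)   # [(1.0, 1.0), (1.5, 1.5)]
```

Passing the step function as a parameter keeps the loop identical for all solvers, so only the initialization and the step differ between them.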
