Research Article

CEnsLoc: Infrastructure-Less Indoor Localization Methodology Using GMM Clustering-Based Classification Ensembles

Beenish Ayesha Akram,1 Ali Hammad Akbar,1 and Ki-Hyung Kim2

1 Department of Computer Science and Engineering, University of Engineering and Technology, Lahore, Pakistan
2 Department of Computer Engineering, Graduate School, Ajou University, Suwon, Republic of Korea

Correspondence should be addressed to Beenish Ayesha Akram.

Hindawi, Mobile Information Systems, Volume 2018, Article ID 3287810, 11 pages, https://doi.org/10.1155/2018/3287810

Received 7 June 2018; Accepted 19 August 2018; Published 1 October 2018

Academic Editor: Subramaniam Ganesan

Copyright © 2018 Beenish Ayesha Akram et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Indoor localization has continued to attract interest over the last decade because its realization remains a challenge. Fingerprinting-based systems are attractive because, unlike radio propagation models, they intrinsically embody signal propagation-related information. Wi-Fi (an RF technology) is well suited for indoor localization because it is so widely deployed that no additional infrastructure is required. Since location-based services depend on the fingerprints acquired through the underlying technology, smart mechanisms such as machine learning are increasingly being incorporated to extract intelligible information. We propose CEnsLoc, a new easy-to-train-and-deploy Wi-Fi localization methodology based on GMM clustering and Random Forest Ensembles (RFEs). Principal component analysis was applied for dimension reduction of the raw data. Experiments demonstrate that CEnsLoc provides 97% accuracy for room prediction. In comparison, localization solutions based on artificial neural networks, k-nearest neighbors, K*, FURIA, and DeepLearning4J provided mean accuracies of 85%, 91%, 90%, 92%, and 73%, respectively, on our collected real-world dataset. CEnsLoc delivers high room-level accuracy with negligible response time, making it viable for real-time applications.

1. Introduction

Positioning systems, also known as localization systems, for both outdoor and indoor use are an active area of research and development, driven by growing markets such as smart buildings, assistive and assisted living, safer metropolitan areas using geographical information systems, and tracking of IoT objects for commercial purposes. User localization is poised to reach 2.6 billion dollars' worth of market share soon [1], especially involving indoor localization solutions. Localization for the outdoors has been successfully commercialized in the form of satellite-based technologies such as GPS, BeiDou, GLONASS, COMPASS, and GALILEO [2]. Indoor positioning cannot be performed using the same technologies because of non-line-of-sight (NLOS) conditions and occlusion. Radio frequency (RF) signals, on the contrary, do not require explicit LOS for operation.

The received signal strength indicator (RSSI), a measure of RF signal quality, has been used in several indoor positioning systems (IPS) for Wi-Fi, RFID, Bluetooth [3], ZigBee [4], and ultrawideband [5]. Moreover, several kinds of sensory input, such as images [6], video, ambient sound [7], accelerometer [8], magnetometer [9], pedometer, and gyroscope readings, and their amalgamation with the aforesaid RF signals [10] have also been explored.

Several approaches and their combinations, such as time of arrival (TOA), time difference of arrival (TDOA), pedestrian dead reckoning (PDR), and angle of arrival (AOA) [11], have also been utilized indoors. Each of these has been shown to have serious limitations. For instance, PDR suffers from error propagation over successive predictions, and precise clock synchronization between sender and receiver is a requirement for TOA- and TDOA-based systems. In addition, specialized antennas are needed for AOA-based systems.

RSSI-based systems rely on one of two widely adopted mechanisms: location estimation using formalized propagation models or using fingerprints. Location systems that are based on the former suffer from low precision at run time
because of variability in channel behavior, including fading and shadowing, and also due to the heterogeneity of device types and form factors [12].

Wi-Fi networks, in the form of Access Points (APs), are prevalent everywhere. Utilizing these to capture RSSI fingerprints (FPs) is conveniently possible with devices as simple as smartphones or phablets. Wi-Fi fingerprint-based localization has the following benefits: no requirement of extra hardware at either the sender or the receiver side, utilization of already existing infrastructure, ease of implementation, and no essential need to build a propagation model, which may or may not depict real signal propagation at run time [13].

The infrastructure of APs allows us to collect a dataset comprising FPs at selected Reference Points (RPs), which essentially becomes a cue to the physical layout of the building, like a map. It is then utilized to prepare the localization system in the training phase. Once trained, the system is ready to be used; i.e., for an unseen FP captured anew, a room number or an associated label is returned by the system to estimate the location.

In this paper, a new localization methodology is presented that combines a data reduction technique, principal component analysis (PCA); a soft clustering technique, the Gaussian mixture model (GMM); and bootstrap-aggregated (bagged) ensembles of decision trees, commonly referred to as Random Forest Ensembles (RFEs). We aspire to provide an infrastructure-less indoor localization methodology that is scalable, is easy to deploy, and provides real-time response with high room-level accuracy instead of explicit coordinates. First, PCA is employed for raw data dimension reduction. Then, clustering is performed to split the data into similar groups to help the classifier better learn the data dynamics. Finally, a separate RFE is trained for every single cluster.

The remainder of the paper is organized as follows. Section 2 presents related work. Preliminary experimentation results are summarized in Section 3. Section 4 provides details of the localization methodology that we have proposed. Section 5 delves into the experimental setup, layout, and results for validation. The conclusion, along with possible future directions, is presented in Section 6.

1.1. Scope of the Study. It is important to declare the scope of the study here, as a prelude to the paper that follows. The paper is aimed at proposing a new methodology that is put to application in a practical and particular environment. The resulting constraints and limitations of our experimental regime are quite natural and fairly generalizable in terms of spatiotemporal aspects, specifically for building size and type, fingerprint-collecting device type, and time of the year. The fingerprints were collected at the Software Engineering Centre of our own university. The building is double-storey and has architectural diversity in terms of rooms, corridors, and an inlaying garden. Nonetheless, an extensive dataset spanning multiple buildings can be obtained to ensure wider spatial diversity. Our dataset was collected using a single Android phone; however, with a large team using a variety of devices, data collection can be performed over different times of the year to obtain data both for training and testing of our approach.

2. Related Work

We summarize here the work on IPS based on Wi-Fi, which is typically a WLAN standard of the IEEE 802.11 family (b, a, g, ac, or any), or on a combination of Wi-Fi with another wireless or sensory input. RADAR [14] from Microsoft® labs is the pioneering research work to employ Wi-Fi signals. Wi-Fi signals received at the base stations (Access Points) from a laptop were used for predicting the user's coordinates using a k-NN-based method and triangulation, reporting a median error of 2-3 m. Li et al. [12] combined affinity propagation, a message passing-based clustering algorithm, with a PSA-based artificial neural network (ANN) for Cartesian coordinates-based prediction. They reported a mean error as low as 1.89 m, with 2.9 m for 90% of estimates.

Song et al. [13] analyzed FP collection as an AP relevancy problem. Hidden Naïve Bayes (HNB) was used as a mechanism to infer the most relevant APs, and they suggested that redundant APs may be obviated for each RP through a variant of ReliefF with the Pearson product-moment correlation coefficient (PPMCC). Moreover, clustering was performed on RPs. One HNB was trained per cluster to approximate user coordinates. Cooper et al. [15] employed a combination of Wi-Fi and Bluetooth low energy (BLE) radio signal FPs, based on a boosting technique, targeting room-level prediction. They trained one classifier per room based on a variant of AdaBoost that used decision stumps in a one-vs-all fashion. Using the combination of Wi-Fi + BLE, they acquired 96% accuracy with a response time of 4.3E−03 seconds. Wang et al. [16], similar to Li et al., presented training of an ANN with back propagation based on PSA for RSSI measurements of RFID tags. They performed data preprocessing by normalizing the dataset to the [0, 1] range along with using a Gaussian filter. They estimated x and y coordinates, reporting a mean error of 0.34 m.

Xu et al. [17] utilized a multilayer neural network (MLNN) for Wi-Fi signals along with network boosting. They trained the MLNN in the two stages commonly followed in deep learning, namely, pretraining using autoencoders and fine tuning using the back propagation algorithm, reporting a mean error of 1.09 m. Zhang et al. [18] proposed a coarse localizer composed of a four-layered deep neural network using stacked denoising autoencoders, succeeded by a hidden Markov model-based fine localizer, reporting a mean error of 0.39 m.

Calderoni et al. [19] utilized RFID tags' RSSI values, targeting room-level accuracy in a hospital environment. They divided the total area into macroregions using a k-means variant, followed by a Random Forest trained per macroregion. Multiple random forests for which the cluster matching score was greater than a particular threshold determined the final prediction, with 83% reported accuracy. Jedari et al. [20] investigated room-level prediction using k-NN, the rule-based JRip, and a Random Forest classifier based on Wi-Fi signals. They concluded that Random Forest, with 91.3% accuracy, produced much better results than k-NN (77.4%) and JRip (72.2%).

Mo et al. [21] proposed the usage of the kernel PCA (KPCA) algorithm for the coarse-level prediction of manually labelled clusters using Random Forest. They derived trained matrices from extracted KPCA features and prepared sub-radio maps. For prediction, the features extracted from coarse positioning, refined by the trained KPCA matrices, were fed to weighted k-NN (WK-NN) for final coordinates estimation. They reported an accuracy of 93% with an error distance of 2 m. Gorak et al. [22] employed Random Forest for finding important APs and applied threshold-based elimination. They detected malfunctioning APs during operation based on the important APs. They reported error rates as low as 4% for detecting the floor and an error of 2 m for horizontal detection. These results were compared against 30% and 7 m, respectively, when malfunctioning APs went undetected. In [23], they divided the FP dataset into subsets, which were overlapping and/or nonoverlapping, as per the presence of RSSI from every AP. Furthermore, one Random Forest was trained for each such subset. They compared the results with a base Random Forest, signifying around 5% to 9% betterment in the average reported error at the floor level. The performance in terms of detection of floors was unchanged.

The aforementioned are some recent efforts in the field of indoor localization using RF signals. k-NN works by storing all the data samples along with their marked ground truth labels. An unseen sample is compared with the complete dataset, based on a similarity measure/distance, to determine its nearest neighbors, where k is the number of neighbors, each of which may carry a weightage. The final decision on the sample's class is based on the majority of the labels of the k nearest neighbors. Several IPS based on k-NN and its variants, such as those of Oussalah et al. [11] and Niu et al. [24], do not scale well when the dataset grows because they require FP matching against the whole dataset.

Moreover, artificial neural networks and deep learning have recently gained a great deal of focus for indoor localization. Artificial neural networks try to mimic the human brain. They form multiple layers of neurons, namely, a single input layer, a single output layer, and one or more hidden layers. At every layer, several neurons are connected to one another according to a specific configuration and triggering function. Every layer affects and triggers neurons in the next layer, following the rules of the learning function, which eventually evolves into the final output. Heavy resource utilization is required during the training phase; however, their response time is negligible due to the minimal computation required. IPS employing ANNs and deep learning, such as Li et al. [12], Ding et al. [25], Leon et al. [26], Zhang et al. [18], and Tuncer and Tuncer [27], face the challenge of finding several tunable parameters, such as the optimal architecture, the number of layers, the learning function, and the number of neurons at every layer. Moreover, the convergence rate and the final accuracy of various configurations do not follow any specific trend. Sometimes a simpler 2-layer configuration takes more time to converge than a 4-layer network, and the accuracy achieved by a 6-layer network is lower than the accuracy obtained by a 3-layer ANN. Hence, heuristics are predominantly used for proposing an architecture based on ANNs, because Monte Carlo configuration testing is impossible. An AP location change or the addition/removal of APs will lead to essential retraining of the IPS, putting ANN and deep learning at a disadvantage.
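As an illustration of the k-NN matching described earlier in this section, the following minimal sketch labels a query fingerprint by an unweighted majority vote among its k nearest training samples. The Euclidean distance, the function name `knn_predict`, and the toy RSSI values are assumptions made for this example; they are not taken from the cited systems, and neighbor weighting is omitted.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def knn_predict(train: List[Tuple[Sequence[float], str]],
                query: Sequence[float], k: int = 3) -> str:
    """Label `query` by majority vote among its k nearest training samples."""
    # Every prediction scans and sorts the whole training set.
    by_distance = sorted(train, key=lambda s: math.dist(s[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Toy fingerprints: (RSSI from two hypothetical APs in dBm, room label).
train = [
    ((-40.0, -80.0), "L1"), ((-42.0, -78.0), "L1"), ((-45.0, -82.0), "L1"),
    ((-85.0, -41.0), "L2"), ((-80.0, -44.0), "L2"),
]
room = knn_predict(train, (-43.0, -79.0), k=3)
```

Because every call sorts the entire training set, the cost of one prediction grows with the dataset size, which is the scaling concern raised above.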

Inspired by existing works, we suggest a clustering-based multiclass classifier approach for room-level prediction, which is easy to train and deploy in terms of computational complexity, provides suitable accuracy, and offers a response time appropriate for real-time applications. Clustering follows the divide-and-conquer approach to help the classifier better learn a group of similar observations instead of the whole dataset. We perform soft classification (clustering) of FPs rather than of RPs, which is what most existing works have done. PCA-based data dimension reduction helps us to reduce the response time of the system by decreasing the number of predictors.

This approach scales well in terms of response time with an increase in the number of rooms, since a single classifier is invoked for each location, and, to the best of our knowledge, it provides the highest accuracy reported so far.

3. Preliminary Experiments

Preliminary experiments were performed on a sample dataset collected from our departmental building to evaluate classifier suitability [28]. The results presented here are on the sample dataset. The detailed experimentation results on the complete dataset of all locations in the building are presented in Section 5.

3.1. Dataset Acquisition. A customized app was developed for an Android phone to record RSSI as vector data coming from Wi-Fi APs. The Wi-Fi FPs of all observable APs, within both the 2.4 and 5 GHz bands, at an RP were scanned using a commercial off-the-shelf Samsung phone (Galaxy J5). FPs were obtained at each RP while hovering the smartphone from 0° up to a right angle (90°) with respect to the floor, as shown in Figure 1, so as to make the phone face N, NW, W, SW, S, SE, E, and NE, with an effort to keep occlusion due to the human torso as low as possible [5, 9]. The user stood at the centre of the RP, held the phone used for FP collection in his hand, and captured multiple FPs in each of the 8 directions shown in Figure 1. A total of A APs were present in the premises, continuously emitting radio signals. These FPs were then stored in a database (DB), each with its respective room label.

3.2. Preprocessing. The resultant dataset was found to be sparsely populated with the identifiers of APs because of the presence/absence of these APs at various RPs labelled in rooms and in the corridors of the building. The measured RSSI values varied between −98 dBm and −15 dBm (from weak to strong, as a result of distance from the APs). Consistent with the well-known practice of keeping the missing values slightly weaker than the weakest signal detected in the dataset [12, 14, 18, 19], the missing values were replaced with −100 dBm.
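The replacement step can be sketched as follows. This is a minimal illustration, assuming each scan is stored as a mapping from a hypothetical AP identifier to its RSSI in dBm; the names `fill_missing` and `MISSING_RSSI_DBM` are ours and do not describe the authors' implementation.

```python
from typing import Dict, List

# Slightly weaker than the weakest measured value (-98 dBm), as in the text.
MISSING_RSSI_DBM = -100.0


def fill_missing(scan: Dict[str, float], ap_ids: List[str],
                 missing: float = MISSING_RSSI_DBM) -> List[float]:
    """Turn one sparse scan (AP id -> RSSI in dBm) into a dense vector.

    APs that were not heard at this reference point get the `missing` value.
    """
    return [scan.get(ap, missing) for ap in ap_ids]


ap_ids = ["ap1", "ap2", "ap3"]        # hypothetical AP identifiers
scan = {"ap1": -42.0, "ap3": -87.0}   # ap2 was not observed in this scan
vector = fill_missing(scan, ap_ids)
```

Keeping a fixed AP order in `ap_ids` gives every fingerprint the same length A, which the later PCA and clustering steps rely on.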

3.3. Classifier Evaluation. We evaluated the performance of 60+ classifiers in WEKA for room-level prediction on the sample dataset; the performances of the top ten classifiers are summarized in Table 1.

Taking into account all the performance measures, from accuracy to receiver operating characteristics (ROC) area, the best overall performance (in descending order) was attained by K*, k-nearest neighbors (k-NN), Random Forest Ensemble (RFE), Fuzzy Unordered Rule Induction Algorithm (FURIA), multilayer perceptron, deep learning for Java (Dl4jMlpClassifier), support vector machines, the Naive Bayes classifier, and finally AdaBoost, which uses decision stumps. K* and k-NN are both instance-based classifiers and produced similar performance results, followed by RFE, FURIA, and multilayer perceptron (ANN). FURIA and ANN show almost similar performance trends. We selected k-NN and ANN for comparison because many existing works have utilized these, which makes comparison with other works easier. Moreover, we chose the K*, FURIA, and DeepLearning4J classifiers for comparison too, as they are state-of-the-art and relatively new machine learning methods. RFE is suited for large datasets, giving high accuracy and time efficiency. It is resilient to noise in data and is also capable of dealing with missing values. RFE utilizes bootstrapping, which reduces variance and keeps the bias in check, because creating different subsets of the training dataset by sampling with replacement ensures that the trees have little or no correlation. Therefore, overfitting is avoided, making it more generalizable. Both the training and prediction times of RFE, owing to the parallel computation supported by bagging, make it suited for real-time implementation of IPS. Hence, we selected RFE [29] as the suitable classifier module in our proposed methodology.
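The bootstrapping mentioned above can be illustrated with a short sketch: each tree of a bagged ensemble receives its own sample of the training set, drawn with replacement. The function names, the fixed seed, and the placeholder fingerprints are assumptions for this example only; tree training itself is not shown.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def bootstrap_sample(data: Sequence[T], rng: random.Random) -> List[T]:
    """Draw len(data) items from `data` uniformly, with replacement."""
    return [data[rng.randrange(len(data))] for _ in data]


def bagged_subsets(data: Sequence[T], n_trees: int, seed: int = 0) -> List[List[T]]:
    """One bootstrap sample per tree; each tree would be trained on its own sample."""
    rng = random.Random(seed)
    return [bootstrap_sample(data, rng) for _ in range(n_trees)]


samples = ["fp0", "fp1", "fp2", "fp3", "fp4"]   # placeholder fingerprints
subsets = bagged_subsets(samples, n_trees=3)
```

Since the draws are independent, the samples usually differ from one another, which is what keeps the correlation between trees low.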

4. Proposed Localization Methodology

4.1. Problem Formulation. We treat localization/positioning as a combination of a clustering problem and a multiclass classification problem, where each room is considered to be a class. A two-dimensional indoor area is partitioned into R square grids of dimensions C × D m². The centre of each square grid is a reference point (RP). A device equipped with a wireless adapter card can sense wireless signals from a total of A APs at a certain RP at a given time, which form the fingerprint $FP_i = (RV_i, L_i)$, with $RV_i = (rssi_{i1}, rssi_{i2}, rssi_{i3}, \ldots, rssi_{iA})$, where $rssi_{ij}$ symbolizes the RSSI value (dBm) from the jth AP in the ith sample of FP collected and $L_i$ is the respective class/room label. Let N such FPs constitute the dataset. The localization function LF is learnt from the FP dataset to map an observed FP to a certain room label $L_x$, as described by the following equation:

$$L_x = LF(FP_i) \tag{1}$$
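To make the notation concrete, the sketch below models one fingerprint as a pair of an RSSI vector and a room label, and the localization function as any callable from an RSSI vector to a label. The class name `Fingerprint` and the rule `strongest_ap_rule` are hypothetical; the rule is only a toy stand-in for LF and is not the classifier proposed in this paper.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Fingerprint:
    rssi: Tuple[float, ...]   # RV_i: one RSSI value (dBm) per AP, length A
    label: str                # L_i: room label of the sample


# LF maps an RSSI vector to a room label, as in Equation (1).
LocalizationFunction = Callable[[Sequence[float]], str]


def strongest_ap_rule(rssi: Sequence[float]) -> str:
    """Toy stand-in for LF: label by the index of the strongest AP."""
    best = max(range(len(rssi)), key=lambda j: rssi[j])
    return f"L{best + 1}"


fp = Fingerprint(rssi=(-60.0, -45.0, -100.0), label="L2")
predicted = strongest_ap_rule(fp.rssi)
```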

There are two phases of the proposed methodology (CEnsLoc), namely, the training phase and the prediction phase, as shown in Figure 2. First of all, a sparse Wi-Fi FP dataset with many missing values is collected. In the training phase, the collected FPs reserved for training the system are preprocessed for missing value replacement, followed by PCA application for dimension reduction. Then, GMM-based hard clustering is used to generate nonoverlapping/disjoint data subsets. For each such subset, a separate RFE is trained for location prediction and stored in the database. In the location prediction phase, the same steps of missing value replacement and PCA computation are performed on the captured FP. Then, the FP is matched with a single cluster using the stored GMM. The final prediction $L_x$ is generated by invoking the respective pretrained RFE for the best matched cluster/subset. These phases are formally elaborated in detail in Sections 4.2 and 4.3, respectively.

4.2. Training Phase. During training, the training dataset is fed to the preprocessing module, which replaces empty readings with the missing value replacement (MVr). PCA is then performed on the dataset for dimension reduction. PCA applies an orthogonal transformation to remove redundant information and thereby decrease the number of predictors. Applying PCA yields the principal components (PCs), a set of linearly uncorrelated variables such that the first principal component captures the maximum variance of any projection of the data, and so on. Choosing the smaller of the number of predictors/APs and the number of samples minus one, A PCs are generated: $PC_1, PC_2, \ldots, PC_A$. For computation of the PCs, first the mean RSSI value of each AP is subtracted from the ith RP using the following equation:

$$X_i = \frac{1}{N}\sum^{N} FP_i - \frac{1}{R}\left(\frac{1}{N}\sum^{R}\sum^{N} FP_i\right) \tag{2}$$

where N is the total number of rows of samples in the dataset and R is the total number of RPs. The PCs matrix is computed by the following equation:

$$PC_i = X_i \times E_A \tag{3}$$

where $E_A$ is the eigenvector matrix of the average RSSI value of each AP in the ith RP. The resulting dataset is then divided into subsets using GMM clustering (K data subsets). Equation (4) is a distribution based on a 2D Gaussian, where the mean is represented by $\mu$ and the covariance matrix by $\Sigma$. A GMM with N overlapping distributions is described by the following equations:

$$\mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{2\pi\sqrt{|\Sigma|}} \exp\left(-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right) \tag{4}$$

$$P(x) = \sum_{k=1}^{N} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k) \tag{5}$$

$$\sum_{k=1}^{N} \pi_k = 1 \tag{6}$$

where $\pi_k$ defines the mixing coefficient that expresses the weight of each mixing element (the weights sum to 1). The resultant shape of the 2D Gaussian is the average of the individual distributions in terms of covariance and mixing coefficients. Assuming that a linear mix of weighted coefficients for each respective distribution's mean and covariance is obtained, and by incorporating sufficient distributions, a final density function may be obtained. The reason behind choosing GMM clustering was the similarity between the Gaussian distribution and the radio propagation characteristics of a Wi-Fi AP [30], which makes GMM a highly suitable candidate for clustering Wi-Fi RSSI vectors.

Figure 1: User orientation during data collection (phone facing N, NE, E, SE, S, SW, W, and NW, with AP1, AP2, AP3, ..., APA around the user).
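Equations (4) to (6) can be evaluated directly in the 2D case. The sketch below is a minimal illustration under stated assumptions: the inputs are two-dimensional feature vectors (for instance, two retained PCs), the mixture parameters are already known, and the function names and toy parameters are ours. Fitting the mixture (for example, with expectation-maximization) is not shown, and the hard assignment simply picks the component with the largest weighted density.

```python
import math
from typing import List, Sequence, Tuple

Vec2 = Tuple[float, float]
Mat2 = Tuple[Tuple[float, float], Tuple[float, float]]


def gaussian_2d(x: Vec2, mu: Vec2, cov: Mat2) -> float:
    """Equation (4): density of a 2D Gaussian with mean mu and covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # (x - mu)^T cov^{-1} (x - mu), using the closed-form 2x2 inverse.
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def mixture_density(x: Vec2, weights: Sequence[float],
                    means: Sequence[Vec2], covs: Sequence[Mat2]) -> float:
    """Equation (5): weighted sum of the component densities."""
    return sum(w * gaussian_2d(x, m, s) for w, m, s in zip(weights, means, covs))


def hard_assign(x: Vec2, weights: Sequence[float],
                means: Sequence[Vec2], covs: Sequence[Mat2]) -> int:
    """Index of the component with the largest weighted density at x."""
    scores: List[float] = [w * gaussian_2d(x, m, s)
                           for w, m, s in zip(weights, means, covs)]
    return max(range(len(scores)), key=scores.__getitem__)


identity: Mat2 = ((1.0, 0.0), (0.0, 1.0))
weights = [0.5, 0.5]                 # mixing coefficients, sum to 1 (Equation (6))
means = [(0.0, 0.0), (4.0, 4.0)]     # toy component means
covs = [identity, identity]
cluster = hard_assign((3.5, 4.2), weights, means, covs)
```

Picking the single best component is one way to turn the soft mixture into the disjoint data subsets used for training one RFE per cluster.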

Furthermore, each data subset is used for training an RFE to predict the room label as a multiclass classifier. The trained models of the GMM as well as all RFEs are stored for later use in the prediction phase.

The algorithm for the training and prediction phases of CEnsLoc is given in Algorithm 1.

Let the total number of samples in the training set be $N_{sample}$, the number of trees be $N_{tree}$, the maximum number of splits allowed be $S_{max}$, the number of predictors for the classifier be $A'$, $f$ be the number of input predictors that are utilized to split at a tree node, and $tc$ denote the total number of classes in the dataset. For finding the best split, RFE uses the Gini index, as given in the following equation, where $P_j$ is the relative frequency of class j in $N_{sample}$:

$$Gini(N_{sample}) = 1 - \sum_{j=1}^{tc} (P_j)^2 \tag{7}$$
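Equation (7) translates into a few lines of code. The following sketch, with a function name of our choosing, computes the Gini index of a list of room labels; a split search would evaluate it on the label lists of candidate child nodes, which is not shown here.

```python
from collections import Counter
from typing import Hashable, Sequence


def gini(labels: Sequence[Hashable]) -> float:
    """Equation (7): 1 minus the sum of squared relative class frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


pure = gini(["L1", "L1", "L1", "L1"])     # a single class
mixed = gini(["L1", "L1", "L2", "L2"])    # two equally frequent classes
```

A node containing a single class has an index of 0, and the index grows as the classes become more evenly mixed.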

RFE is trained using the method presented in Algorithm 2.

4.3. Prediction Phase. The average of the collected RSSI FPs is fed to CEnsLoc. A PCs are computed by a method similar to that of the training phase. The saved GMM model is invoked for cluster matching. The matched cluster's trained RFE is invoked, where the final decision is computed by the majority vote described in Algorithm 2.
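Two of these run-time steps are simple enough to sketch: projecting a new fingerprint with the mean and eigenvector matrix saved during training, and taking the majority vote over the per-tree predictions. The stored values below are hypothetical placeholders, and the function names are ours; the cluster matching and the trees themselves are omitted.

```python
from collections import Counter
from typing import List, Sequence


def project(fp: Sequence[float], mean: Sequence[float],
            eigvecs: Sequence[Sequence[float]]) -> List[float]:
    """Center one fingerprint with the stored training mean, then project it.

    eigvecs[j][c] is the weight of AP j in principal component c, so the
    result has one value per retained component (compare Equation (3)).
    """
    centered = [v - m for v, m in zip(fp, mean)]
    n_components = len(eigvecs[0])
    return [sum(centered[j] * eigvecs[j][c] for j in range(len(centered)))
            for c in range(n_components)]


def majority_vote(tree_predictions: Sequence[str]) -> str:
    """Room label predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


mean = [-60.0, -70.0, -80.0]     # hypothetical per-AP training means (dBm)
eigvecs = [[1.0, 0.0],           # hypothetical 3 x 2 eigenvector matrix
           [0.0, 1.0],
           [0.0, 0.0]]
pcs = project([-55.0, -72.0, -90.0], mean, eigvecs)
room = majority_vote(["L3", "L1", "L3", "L3", "L7"])
```

Reusing the training mean and eigenvectors at run time keeps the new fingerprint in the same feature space as the data on which the GMM and the RFEs were trained.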

4.4. Time Complexity of Training and Prediction Phases for CEnsLoc. Ceteris paribus, the time complexity of the training phase and the prediction phase for CEnsLoc is essentially dependent upon the size of the experimental area, our acquisition regimen, the resultant dataset of FPs, and how the dataset is manipulated by the tandem of schemes we employ.

Figure 2: Proposed localization methodology (CEnsLoc). Training phase: Wi-Fi FPs collection (Wi-Fi RSSI FPs with room labels), preprocessing, PCA, generation of data subsets (GMM clustering), and one RFE trained for each cluster subset. Prediction phase: run-time observed Wi-Fi FP, preprocessing, PCA, cluster matching, cluster selection, invocation of the respective pretrained RFE, and the estimated room.

Table 1: A comparison of performance measures of various classification algorithms for Wi-Fi-based position estimation.

| Algorithm | Time to build model (sec) | Accuracy (%) | Kappa statistic | RMSE | Precision | Recall | F1 | MCC | ROC area |
|---|---|---|---|---|---|---|---|---|---|
| K* | 0 | 99.52 | 0.98 | 0.02 | 0.99 | 0.99 | 0.99 | 0.99 | 1 |
| k-NN | 0 | 99.06 | 0.99 | 0.03 | 0.99 | 0.99 | 0.99 | 0.98 | 0.99 |
| RFE | 1.11 | 98.76 | 0.98 | 0.04 | 0.98 | 0.98 | 0.98 | 0.98 | 1 |
| FURIA | 5.92 | 97.26 | 0.96 | 0.05 | 0.97 | 0.97 | 0.97 | 0.96 | 0.99 |
| Multilayer perceptron | 25.84 | 97.05 | 0.96 | 0.06 | 0.97 | 0.97 | 0.97 | 0.96 | 0.99 |
| J48 | 0.1 | 95.91 | 0.95 | 0.08 | 0.95 | 0.95 | 0.95 | 0.95 | 0.98 |
| Dl4jMlp classifier | 26.31 | 94 | 0.93 | 0.08 | 0.94 | 0.94 | 0.94 | 0.93 | 0.99 |
| SVM | 5.82 | 90.6 | 0.89 | 0.12 | 0.93 | 0.90 | 0.90 | 0.9 | 0.94 |
| Naive Bayes | 0.02 | 89.79 | 0.88 | 0.12 | 0.91 | 0.89 | 0.89 | 0.89 | 0.99 |
| AdaBoost with decision stump | 0.1 | 36.81 | 0.22 | 0.25 | 0.15 | 0.36 | 0.21 | 0.18 | 0.76 |

4.4.1. Time Complexity of Training. In the training phase, for PCA, the time complexity is governed by the following equation:

$$O(\min(A^3, N^3)) \tag{8}$$

where A is the number of predictors and N is the number of observations. For training a decision tree (DT) that has not been pruned, the expression is as follows:

$$O(A \times N \log(N)) \tag{9}$$

As an RFE consists of numerous DTs, merely a small number f of the total predictors A is used. The complexity for a single DT in an RFE is represented by Equation (10) and the complexity of $N_{tree}$ trees by Equation (11):

$$O(f \times N \log(N)) \tag{10}$$

$$O(N_{tree} \times f \times N \log(N)) \tag{11}$$

where $N_{tree}$ is the number of trees in the RFE and f is the number of random features that are chosen to get the best split.

When the depth of the grown trees is controlled using $S_{max}$, the training complexity of one RFE is as follows:

$$O(N_{tree} \times f \times N \times S_{max}) \tag{12}$$

Since K ensembles are grown for predicting the room level, the time-to-train complexity is represented by the following equation:

$$O(N_{tree} \times f \times N \times S_{max} \times K) \tag{13}$$

For GMM, the complexity is expressed by the following equation:

$$O(N \times K \times D^3) \tag{14}$$

where N is the number of samples, K is the number of components, and D is the number of dimensions.

Incorporating all of these, the time complexity to train CEnsLoc is as follows:

$$O(\min(A^3, N^3)) + O(N \times K \times D^3) + O(N_{tree} \times f \times N \times S_{max} \times K) \tag{15}$$

442 Time Complexity of Prediction For prediction timecomplexity for PCA is given by the following equation

O min A3 N

31113872 11138731113872 1113873 (16)

(e complexity for a DT and an ensemble in terms ofprediction time are shown by Equations (17) and (18)respectively

O(N log(N)) (17)

O Ntree times N log(N)( 1113857 (18)

Smax controls trees depth therefore complexity for anensemble is given by the following equation

O Ntree times N times Smax( 1113857 (19)

Only a single RFE out of K ensembles gets invoked forpredicting a room

(erefore time complexity for GMM is given by thefollowing equation

O K times D3

1113872 1113873 (20)

Finally, the overall prediction complexity of CEnsLoc is as follows:

$$O\left(\min\left(A^{3}, N^{3}\right)\right) + O\left(K \times D^{3}\right) + O\left(N_{\mathrm{tree}} \times N \times S_{\max}\right) \tag{21}$$
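To get a feel for which term dominates in Equations (15) and (21), the individual terms can be evaluated numerically. The following is a minimal sketch and not part of the original study: the function names are hypothetical, big-O terms are treated as rough operation counts (constants are ignored), and N = 14000 is an assumed round figure. The remaining values (40 APs, 23 PCs, 2 clusters, 132 trees, 8 random features, 1024 maximum splits) are the ones reported in Section 5.

```python
def training_terms(A, N, K, D, n_tree, f, s_max):
    """Terms of Equation (15): PCA, GMM, and K random forest ensembles."""
    return {
        "pca": min(A**3, N**3),
        "gmm": N * K * D**3,
        "rfe": n_tree * f * N * s_max * K,
    }


def prediction_terms(A, N, K, D, n_tree, s_max):
    """Terms of Equation (21): PCA, GMM matching, and one invoked ensemble."""
    return {
        "pca": min(A**3, N**3),
        "gmm": K * D**3,
        "rfe": n_tree * N * s_max,
    }


# Illustrative values only; N = 14000 is an assumption of this sketch.
train = training_terms(A=40, N=14000, K=2, D=23, n_tree=132, f=8, s_max=1024)
# A single query fingerprint at prediction time (N = 1).
predict = prediction_terms(A=40, N=1, K=2, D=23, n_tree=132, s_max=1024)

dominant = max(train, key=train.get)
print("dominant training term:", dominant)
```

Under these assumptions the ensemble term outweighs the PCA and GMM terms during training, which is consistent with the K factor appearing only in the last term of Equation (15).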

5. Experimental Results and Discussion

This section describes the hardware, the software, and the particulars of the experiments that were used to evaluate the performance of CEnsLoc in terms of accuracy, precision, recall, training time, and response time.

An Intel machine (64-bit Xeon X5650) with a master clock at 2.67 GHz, 24 GB RAM, and 64-bit Windows 10 Education was used for experimentation in MATLAB. The real dataset was developed through FP collection at the ground floor of the Software Engineering (SE) Centre, University of Engineering and Technology (UET), Lahore, Pakistan. The building's dimensions are 39 m × 31 m (1209 m²), containing offices, class rooms, laboratories, and open corridors. Figures 3 and 4 depict the building's floor plan, room labels (L1–L10 closed rooms, L12 open corridor, and L11 a semiopen room), a total of 180 RPs, and the number of samples collected per room. RPs are represented by small coloured dots in the rooms in Figure 3, and Figure 4 lists the total number of samples collected at each location, L1–L12. This notion of rooms was used because walls play a crucial role in the fluctuation of Wi-Fi RSSI values [31, 32]. The area was divided into a grid of cells of 1.5 × 1.5 m². Each cell center was marked as the Reference Point (RP) for FP collection.

Input: training dataset with total A predictors
       Missing value replacement MVr
       Maximum number of clusters Kmax
Output: predicted location Lx
For training:
  Replace empty values with MVr
  Apply PCA on dataset to generate A′ predictors
  For k = 1 -> Kmax
    Generate clusters
    Generate and save k data subsets
    For each p ∈ k data subsets
      Train p RFE using Algorithm 2 (training)
      Calculate performance measures
    End for
  End for
  Choose optimal configuration
  Save respective models for GMM, all RFEs
For prediction at a new point x:
  Replace missing values with MVr
  Apply PCA on the FP
  Match one cluster Cmatch
  Invoke RFE of Cmatch using Algorithm 2 (prediction)

ALGORITHM 1: CEnsLoc training and prediction algorithm.
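The flow of Algorithm 1 (missing value replacement, PCA, GMM cluster matching, and one RFE per cluster) can be sketched in a few lines. This is a hedged illustration rather than the authors' MATLAB implementation: it assumes scikit-learn and NumPy, the class name CEnsLocSketch is hypothetical, the missing value replacement of -100 dBm is an assumed placeholder for MVr, the "diag" covariance option only approximates a shared diagonal covariance, the fallback forest for clusters that received no training data is an addition of this sketch, and the data are synthetic.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.mixture import GaussianMixture


class CEnsLocSketch:
    """PCA -> GMM cluster matching -> one random forest per cluster."""

    def __init__(self, n_pcs=23, n_clusters=2, n_trees=132, n_features=8,
                 max_splits=1024, mvr=-100.0, seed=0):
        self.mvr = mvr  # assumed missing value replacement (dBm)
        self.pca = PCA(n_components=n_pcs, random_state=seed)
        self.gmm = GaussianMixture(n_components=n_clusters,
                                   covariance_type="diag", random_state=seed)
        # A binary tree with S splits has S + 1 leaves.
        self.rf_args = dict(n_estimators=n_trees, max_features=n_features,
                            max_leaf_nodes=max_splits + 1, random_state=seed)
        self.forests = {}

    def _prepare(self, X):
        X = np.asarray(X, dtype=float)
        return np.where(np.isnan(X), self.mvr, X)

    def fit(self, X, y):
        y = np.asarray(y)
        Z = self.pca.fit_transform(self._prepare(X))
        clusters = self.gmm.fit_predict(Z)
        # Fallback forest on all data (not in the paper) for unseen clusters.
        self.fallback = RandomForestClassifier(**self.rf_args).fit(Z, y)
        for c in np.unique(clusters):
            mask = clusters == c
            forest = RandomForestClassifier(**self.rf_args)
            self.forests[int(c)] = forest.fit(Z[mask], y[mask])
        return self

    def predict(self, X):
        Z = self.pca.transform(self._prepare(X))
        clusters = self.gmm.predict(Z)
        out = np.empty(len(Z), dtype=object)
        for c in np.unique(clusters):
            mask = clusters == c
            model = self.forests.get(int(c), self.fallback)
            out[mask] = model.predict(Z[mask])  # only the matched RFE votes
        return out


# Synthetic fingerprints: 4 rooms, 40 APs, Gaussian noise around room means.
rng = np.random.default_rng(0)
rooms = ["L1", "L2", "L3", "L4"]
means = rng.uniform(-90, -40, size=(len(rooms), 40))
X = np.vstack([m + rng.normal(0, 2, size=(100, 40)) for m in means])
y = np.repeat(rooms, 100)
X[rng.random(X.shape) < 0.02] = np.nan  # APs that were not heard

idx = rng.permutation(len(X))
train_idx, test_idx = idx[:280], idx[280:]
model = CEnsLocSketch(n_pcs=10, n_trees=20, n_features=3, max_splits=64)
model.fit(X[train_idx], y[train_idx])
accuracy = float(np.mean(model.predict(X[test_idx]).astype(str) == y[test_idx]))
print("synthetic test accuracy:", accuracy)
```

The sketch fixes the number of clusters instead of looping over 1 to Kmax; the outer search of Algorithm 1 would wrap `fit` in a loop over candidate cluster counts and keep the configuration with the best performance measures.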

The complete dataset consisted of 20,087 Wi-Fi RSSI FPs in which a total of 40 APs were detected. Figure 4 depicts the number of FPs captured in each room/location marked as L1–L12. It must be noted that all these APs belong to the university infrastructure comprising the SE Centre and its immediate neighboring buildings. The FPs were preprocessed following the same process described in Sections 3.1 and 3.2. PCA-based dimension reduction was employed, with optimal results found with 23 PCs. The resultant dataset was divided in a 70:30 stratified ratio into training and testing subsets. The experimental results are discussed using both 10-fold cross validation (10-CV) on the training subset and the unseen 30% test subset. GMM clustering then partitioned the training data into further subsets, and the optimal configuration was two clusters with shared covariance set to true and a diagonal covariance type. Furthermore, one RFE was trained for each subset with 132 trees, 1024 maximum splits, and 8 random features. CEnsLoc was hosted on a machine, and the mean of observed RSSI FPs was used for run-time location estimation after applying the same PCA model and GMM cluster matching finalized in the training phase. The best matched cluster's respective RFE was invoked for room prediction. Various companion apps can query the location of subscribed users using the IPS configured as a server. Response time was computed as the average of the difference between localization query time and prediction generation time. The following formulae were used to compute the performance parameters:

$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{precision} = \frac{TP}{TP + FP}, \qquad \text{recall} = \frac{TP}{TP + FN} \tag{22}$$
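As a worked example of Equation (22), the counts TP, TN, FP, and FN can be obtained for one room at a time by treating that room as the positive class and all others as negative. The sketch below is an illustration under that one-vs-rest assumption; the paper does not state how per-room values were aggregated, and the function names and toy labels are hypothetical.

```python
def one_vs_rest_counts(y_true, y_pred, positive):
    """Count TP, TN, FP, FN with `positive` as the room of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    # Returning 0.0 when nothing was predicted positive is a convention
    # chosen for this sketch.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy predictions for five fingerprints.
y_true = ["L1", "L1", "L2", "L2", "L3"]
y_pred = ["L1", "L2", "L2", "L2", "L1"]

tp, tn, fp, fn = one_vs_rest_counts(y_true, y_pred, positive="L1")
print(accuracy(tp, tn, fp, fn), precision(tp, fp), recall(tp, fn))
```

For room L1 in this toy example there is one true positive, one false negative, one false positive, and two true negatives.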

5.1. Classification Effectiveness and Efficiency. Tables 2 and 3 summarize the 10-CV performance evaluation of CEnsLoc, comparing it with k-NN [33], artificial neural network (ANN) [34], K* [35], FURIA [36], and DeepLearning4J [36]. The best results are highlighted throughout with boldface. k-NN results were averaged over six different configurations based on the number of neighbors, the similarity measure employed, and neighbor weightage. Similarly, six different configurations of ANN, namely, 2-, 3-, and 4-layer ANNs, each having a varying number of neurons per layer and employing two learning algorithms, SCG and RBP, were averaged to obtain the results. For K*, the entropic blend percentage was varied from 10 to 90 percent. The results for FURIA were obtained by varying the count of folds for growth and pruning as well as the minimum instance weight for each split. DeepLearning4J results were obtained by varying the number of neurons per layer, the number of dense layers in the network, and the training algorithm, out of many possible variations in the hyperparameters. The results for CEnsLoc were generated by the optimal configuration found for our proposed approach. For all the approaches, the same set of configurations used during the 10-CV performance evaluation was used for result generation on the 30% unseen stratified test data subset, whose results are presented in Table 4. The best performance on both the 10-CV results (Table 2) and the test dataset (Table 4) was obtained by CEnsLoc with 97% and 95% accuracy, followed by FURIA (92%, 90%), k-NN (91%, 89%), K* (90%, 87%), ANN (85%, 82%), and DeepLearning4J (73%, 71%), respectively. The results are presented on both 10-CV and the unseen test data subset to show that, for a small dataset, 10-CV results can provide a good performance estimate, as the results on the test data subset showed trends similar to those of the 10-CV results. Although deep learning and ANN provide good performance, finding the optimal configuration for a huge number of tunable parameters is a tricky task. Moreover, during experiments it was found that, despite generic guidelines for parameter tuning, the performance measures and convergence rate of ANN as well as deep learning schemes fluctuate highly with even a slight variation in the number of neurons in layers, the training algorithm, and the number of hidden layers, which makes it harder to find the optimal configuration for ANN and deep learning models. Lazy and instance-based approaches such as k-NN and K* are also good candidates for indoor localization; however, they have a very limited number of tunable parameters, resulting in an inability to surpass CEnsLoc performance despite trying different parameter combinations. They do not generalize as well as RFE, as the end result of prediction is heavily dependent on the majority vote by the k closest matched samples in the dataset. They also need template matching with the entire dataset for one location prediction, which results in growing response time with an increasing number of samples in the dataset. Such growth is highly likely in practical real-world scenarios, as a typical building has quite a large number of visible APs, and a sufficiently large number of samples is also required for classifiers to work properly.

Input: data subset with total A′ predictors for training
       Number of trees Ntree
       Allowed maximum number of splits Smax
       Random number of predictors/features f
Output: estimated location Lx
For system training:
  Step 1: for l = 1 to Ntree
    (i) Choose a bootstrap sample set (SS) of size Nsample with replacement from the training data subset
    (ii) Grow a Random Forest Tree (Tl) on SS by recursively iterating (a)–(c) for every terminal node of the tree until the maximum number of splits (Smax) is reached
      (a) Randomly select f features/variables from the A′ predictors (f << A′)
      (b) Choose the best feature/split-point from the f employing the Gini index
      (c) Split the node into two children nodes
  Step 2: Produce the resulting ensemble of trees {Tl}, l = 1, ..., Ntree
For location prediction at a new point x from the RFE:
  Let Lm(x) be the room/class prediction by the mth RFE tree
  Lrf(x) = majority vote {Lm(x)}, m = 1, ..., Ntree

ALGORITHM 2: RFE classification algorithm for training and prediction.
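The prediction step of Algorithm 2, in which every tree of the ensemble votes and the majority label is returned, can be sketched as follows. This is an illustrative toy and not the trained ensemble from the experiments: the three "trees" are hypothetical hand-written threshold rules on an RSSI vector, and the tie-breaking rule (first label encountered wins) is a choice made for this sketch.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def forest_predict(trees, x):
    """Collect one room prediction per tree and take the majority vote."""
    return majority_vote([tree(x) for tree in trees])


# Three hypothetical trees, each reduced to a single threshold rule on
# one RSSI value (in dBm) of the fingerprint x.
trees = [
    lambda x: "L1" if x[0] > -60 else "L2",
    lambda x: "L1" if x[1] > -70 else "L2",
    lambda x: "L2" if x[2] > -50 else "L1",
]

room = forest_predict(trees, [-55, -80, -65])
print(room)
```

In a real ensemble each callable would be a decision tree grown on its own bootstrap sample, as in Step 1 of Algorithm 2.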

A minimum response time of 2.05E-05 seconds was obtained by FURIA. DeepLearning4J stood second with a response time of 6.82E-05 seconds. ANN, k-NN, and CEnsLoc all had response times on the scale of E-04 seconds, which is 10 times slower than the aforementioned two approaches, but the difference was trifling and cannot be detected by any human user of the system. The accuracy, precision, and recall provided by CEnsLoc were much greater than those of all other approaches.

Figure 3: Room labels and marked RPs (floor plan showing rooms L1–L11 and the open corridor L12).

Figure 4: Number of samples per room (bar chart; x-axis: locations L1–L12; y-axis: number of samples, 0 to 5000; bar labels in extraction order: 446, 446, 717, 2700, 2700, 2700, 2700, 1446, 250, 743, 743, 4496).

FURIA, an upgraded version of the RIPPER algorithm, stands out as the second best performer for indoor localization, indicating that rule-based algorithms, especially fuzzified versions with good generalization capabilities, perform well for location estimation as well. However, it lagged behind CEnsLoc in terms of accuracy, precision, and recall by 5%, 10%, and 15%, respectively.

The details of both the 10-CV and test dataset results for all the compared IPS and CEnsLoc are depicted in Figure 5 for side-by-side visual comparison.

5.2. Out-of-Bag (OOB) Error Results. There is a performance measure called OOB error which is peculiar to RFE. During training of an RFE, data subsets are generated with replacement, resulting in some repeated and some left-out observations (OOB observations) for each tree. That particular tree does not train on these left-out OOB observations. The prediction capability of an RFE can be measured using "OOB loss," which is the error made on the unseen OOB observations during training. The OOB loss measure has been investigated and shown to provide an upper bound on testing error [37], which is specifically useful for small-sized datasets. Hence, OOB error can be used just like, or instead of, an unseen test data subset if the available dataset is small or an unseen test dataset is unavailable. It provides a very good estimate of the trained classifier's generalization capability. Table 5 summarizes the OOB loss compared with the averaged 10-CV loss, indicating that it indeed bounds it.
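The mechanics behind the OOB loss can be made concrete with a small sketch: each tree draws a bootstrap sample, the observations it never drew are its OOB set, and an observation is scored using only the trees for which it was out of bag. The code below is a simplified illustration with hypothetical function names and toy per-tree predictions; it is not the MATLAB routine used in the experiments.

```python
import random
from collections import Counter


def bootstrap_indices(n, rng):
    """Draw n indices with replacement (the in-bag sample of one tree)."""
    return [rng.randrange(n) for _ in range(n)]


def oob_indices(n, in_bag):
    """Indices that the bootstrap sample never picked."""
    drawn = set(in_bag)
    return {i for i in range(n) if i not in drawn}


def oob_error(y, per_tree_pred, per_tree_oob):
    """Misclassification rate using, for each sample, only the trees
    for which that sample was out of bag."""
    wrong = scored = 0
    for i, truth in enumerate(y):
        votes = [pred[i] for pred, oob in zip(per_tree_pred, per_tree_oob)
                 if i in oob]
        if not votes:
            continue  # sample was in bag for every tree
        scored += 1
        wrong += Counter(votes).most_common(1)[0][0] != truth
    return wrong / scored if scored else 0.0


# Roughly one third of the observations are left out of each bootstrap.
rng = random.Random(0)
n = 1000
oob_fraction = len(oob_indices(n, bootstrap_indices(n, rng))) / n

# Toy example: two trees, three samples.
y = ["a", "b", "a"]
per_tree_pred = [["a", "a", "b"], ["a", "b", "a"]]
per_tree_oob = [{0, 1}, {2}]
err = oob_error(y, per_tree_pred, per_tree_oob)
print(oob_fraction, err)
```

In the toy example only the second sample is misclassified by the tree for which it was out of bag, so one of the three scored samples is wrong.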

6. Conclusion and Future Work

Location prediction/estimation provides derivation of meaningful context for a broad range of services and applications. Indoor localization can open a plethora of new opportunities because humans spend most of their time indoors. CEnsLoc offers shorter response time and an overall improvement in accuracy, precision, and recall. With only a few parameters to be tuned, it is suited for FP-based localization, which requires frequent recollection of data and retraining. CEnsLoc was able to attain 97% accuracy in comparison with other IPS averaged over 6 different configurations, namely, FURIA, k-NN, K*, ANN, and DeepLearning4J, with 92%, 91%, 90%, 85%, and 73% accuracies, respectively. It can be utilized for elderly assistance, navigation, smart buildings, and smart transportation, to name a few potential applications.

Our future work includes deployment of CEnsLoc across a wide range of civil infrastructures, including different floors and/or buildings, to understand its performance in more detail, as well as its scalability using crowdsourcing. We also aim to build safety, security, and evacuation guide applications with CEnsLoc at their core for users in offices, universities, and retail. Furthermore, integration with GPS, Bluetooth, and PDR to further enhance accuracy, utilizing the hybrid technologies available at a given time, is also part of the plan.

Table 2: Tenfold cross-validated performance evaluation and comparison of CEnsLoc.

| | Accuracy | Precision | Recall |
|---|---|---|---|
| k-NN | 0.91 | 0.88 | 0.87 |
| ANN | 0.85 | 0.83 | 0.78 |
| K* | 0.90 | 0.96 | 0.62 |
| FURIA | 0.92 | 0.89 | 0.82 |
| DeepLearning4J | 0.73 | 0.51 | 0.41 |
| CEnsLoc | 0.97 | 0.98 | 0.97 |

Table 3: Tenfold cross-validated performance comparison of CEnsLoc in terms of training and response time (seconds).

| Time (sec) | Training time (10-fold) | Avg. training time (1-fold) | Response time (dataset) | Response time (1 sample) |
|---|---|---|---|---|
| k-NN | - | - | 197 | 1.70E-04 |
| ANN | 26720 | 2672 | 018 | 1.41E-04 |
| K* | - | - | 10309 | 8.17E-02 |
| FURIA | 1587 | 158 | 003 | 2.05E-05 |
| DeepLearning4J | 75559 | 755 | 009 | 6.82E-05 |
| CEnsLoc | 14007 | 14007 | 235 | 2.08E-04 |

Table 4: Performance evaluation and comparison of CEnsLoc on test subset.

| | Accuracy | Precision | Recall |
|---|---|---|---|
| k-NN | 0.89 | 0.87 | 0.85 |
| ANN | 0.82 | 0.81 | 0.76 |
| K* | 0.87 | 0.94 | 0.60 |
| FURIA | 0.90 | 0.86 | 0.79 |
| DeepLearning4J | 0.71 | 0.48 | 0.39 |
| CEnsLoc | 0.95 | 0.96 | 0.94 |

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07048697).

References

[1] J. Torres-Sospedra, R. Montoliu, S. Trilles, O. Belmonte, and J. Huerta, "Comprehensive analysis of distance and similarity measures for Wi-Fi fingerprinting indoor positioning systems," Expert Systems with Applications, vol. 42, no. 23, pp. 9263–9278, 2015.

[2] H. Mehmood and N. K. Tripathi, "Cascading artificial neural networks optimized by genetic algorithms and integrated with global navigation satellite system to offer accurate ubiquitous positioning in urban environment," Computers, Environment and Urban Systems, vol. 37, no. 1, pp. 35–44, 2013.

[3] M. M. Soltani, A. Motamedi, and A. Hammad, "Enhancing cluster-based RFID tag localization using artificial neural networks and virtual reference tags," Automation in Construction, vol. 54, pp. 93–105, 2015.

[4] M. L. Rodrigues, L. F. M. Vieira, and M. F. M. Campos, "Fingerprinting-based radio localization in indoor environments using multiple wireless technologies," in Proceedings of IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, pp. 1203–1207, Toronto, ON, Canada, September 2011.

[5] J. Luo and H. Gao, "Deep belief networks for fingerprinting indoor localization using ultrawideband technology," International Journal of Distributed Sensor Networks, vol. 12, no. 1, article 5840916, 2016.

[6] S. Saeedi, L. Paull, M. Trentini, and H. Li, "Neural network-based multiple robot simultaneous localization and mapping," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 880–885, 2011.

[7] M. Azizyan, R. R. Choudhury, and I. Constandache, "SurroundSense: mobile phone localization via ambience fingerprinting," in Proceedings of the 15th Annual International Conference on Mobile Computing and Networking-MobiCom'09, pp. 261–272, Beijing, China, 2009.

[8] L. Pei, R. Chen, J. Liu et al., "Motion Recognition Assisted Indoor Wireless Navigation on a Mobile Phone," in Proceedings of the 23rd International Technical Meeting of The Satellite Division of the Institute of Navigation, pp. 3366–3375, Portland, OR, USA, September 2010.

[9] P. S. Nagpal and R. Rashidzadeh, "Indoor positioning using magnetic compass and accelerometer of smartphones," in Proceedings of International Conference on Selected Topics in Mobile and Wireless Networking (MoWNeT), pp. 140–145, Montreal, Canada, 2013.

[10] J. Menke and A. Zakhor, "Multi-modal indoor positioning of mobile devices," in Proceedings of IEEE International Conference on Indoor Positioning and Indoor Navigation, pp. 13–16, Alberta, Canada, October 2015.

[11] M. Oussalah, M. Alakhras, and M. I. Hussein, "Multivariable fuzzy inference system for fingerprinting indoor localization," Fuzzy Sets and Systems, vol. 269, pp. 65–89, 2015.

[12] N. Li, J. Chen, Y. Yuan, X. Tian, Y. Han, and M. Xia, "A wi-fi indoor localization strategy using particle swarm optimization based artificial neural networks," International Journal of Distributed Sensor Networks, vol. 12, no. 3, article 4583147, 2016.

Figure 5: Performance measures on both 10-CV and test dataset (bar charts of precision, recall, and accuracy, one panel for 10-CV and one for the test dataset; x-axis in each panel: k-NN, 4-layer ANN (SCG), 4-layer ANN (RBP), 3-layer ANN (SCG), 3-layer ANN (RBP), 2-layer ANN (SCG), 2-layer ANN (RBP), K*, FURIA, DeepLearning4J, CEnsLoc).

Table 5: OOB loss.

| Average 10-fold loss | OOB loss |
|---|---|
| 0.0292813 | 0.0300123 |


[13] C. Song, J. Wang, and G. Yuan, "Hidden naive bayes indoor fingerprinting localization based on best-discriminating AP selection," ISPRS International Journal of Geo-Information, vol. 5, no. 10, p. 189, 2016.

[14] P. Bahl and V. N. Padmanabhan, "RADAR: an in-building RF based user location and tracking system," in Proceedings of IEEE INFOCOM 2000, Conference on Computer Communications, Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies (Cat. No. 00CH37064), vol. 2, pp. 775–784, Tel Aviv, Israel, March 2000.

[15] M. Cooper, J. Biehl, G. Filby, and S. Kratz, "LoCo: boosting for indoor location classification combining Wi-Fi and BLE," Personal and Ubiquitous Computing, vol. 20, no. 1, pp. 1–14, 2016.

[16] C. Wang, F. Wu, Z. Shi, and D. Zhang, "Indoor positioning technique by combining RFID and particle swarm optimization-based back propagation neural network," Optik (Stuttg), vol. 127, no. 17, pp. 6839–6849, 2016.

[17] J. Xu, H. Dai, and W. Ying, "Multi-layer neural network for received signal strength-based indoor localisation," IET Communications, vol. 10, no. 6, pp. 717–723, 2016.

[18] W. Zhang, K. Liu, W. Zhang, Y. Zhang, and J. Gu, "Deep neural networks for wireless localization in indoor and outdoor environments," Neurocomputing, vol. 194, pp. 279–287, 2016.

[19] L. Calderoni, M. Ferrara, A. Franco, and D. Maio, "Indoor localization in a hospital environment using Random Forest classifiers," Expert Systems with Applications, vol. 42, no. 1, pp. 125–134, 2015.

[20] E. Jedari, Z. Wu, R. Rashidzadeh, and M. Saif, "Wi-Fi based indoor location positioning employing random forest classifier," in Proceedings of 2015 International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 13–16, Banff, Alberta, Canada, October 2015.

[21] Y. Mo, Z. Zhang, Y. Lu, W. Meng, and G. Agha, "Random forest based coarse locating and KPCA feature extraction for indoor positioning system," Mathematical Problems in Engineering, vol. 2014, Article ID 850926, 8 pages, 2014.

[22] R. Gorak and M. Luckner, "Malfunction immune Wi–Fi localisation method," in Computational Collective Intelligence, pp. 328–337, Springer, Cham, Switzerland, 2015.

[23] R. Gorak and M. Luckner, "Modified random forest algorithm for Wi–Fi indoor localization system," in Proceedings of International Conference on Computational Collective Intelligence, pp. 147–157, Cham, Switzerland, September 2016.

[24] J. Niu, B. Wang, L. Cheng, and J. J. P. C. Rodrigues, "WicLoc: an indoor localization system based on WiFi fingerprints and crowdsourcing," in Proceedings of 2015 IEEE International Conference on Communications (ICC), pp. 3008–3013, London, UK, June 2015.

[25] G. Ding, Z. Tan, J. Zhang, and L. Zhang, "Fingerprinting localization based on affinity propagation clustering and artificial neural networks," in Proceedings of IEEE Wireless Communications and Networking Conference (WCNC), pp. 2317–2322, Shanghai, China, April 2013.

[26] O. Leon, J. Hernandez-Serrano, and M. Soriano, "Securing cognitive radio networks," International Journal of Communication Systems, vol. 23, no. 5, pp. 633–652, 2010.

[27] S. Tuncer and T. Tuncer, "Indoor localization with bluetooth technology using artificial neural networks," in Proceedings of the IEEE 19th International Conference on Intelligent Engineering Systems, pp. 213–217, Bratislava, Slovakia, September 2015.

[28] B. A. Akram, A. H. Akbar, B. Wajid, O. Shafiq, and A. Zafar, "LocSwayamwar: finding a suitable ML algorithm for wi-fi fingerprinting based indoor positioning system," in Lecture Notes in Electrical Engineering, A. Boyaci, A. Ekti, M. Aydin, and S. Yarkan, Eds., vol. 504, Springer, Singapore, 2018.

[29] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.

[30] K. Kaji and N. Kawaguchi, "Design and implementation of wifi indoor localization based on Gaussian mixture model and particle filter," in Proceedings of 2012 International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 1–9, Sydney, Australia, November 2012.

[31] C. Wu, Z. Yang, Y. Liu, and W. Xi, "WILL: wireless indoor localization without site survey," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 4, pp. 839–848, 2013.

[32] S. Yang, P. Dessai, M. Verma, and M. Gerla, "FreeLoc: calibration-free crowdsourced indoor localization," in Proceedings of IEEE INFOCOM, pp. 2481–2489, Turin, Italy, April 2013.

[33] T. Seidl and H.-P. Kriegel, "Optimal multi-step k-nearest neighbor search," ACM SIGMOD Record, vol. 27, no. 2, pp. 154–165, 1998.

[34] W. S. Mcculloch and W. Pitts, "A logical calculus nervous activity," Bulletin of Mathematical Biology, vol. 52, no. 1-2, pp. 99–115, 1990.

[35] J. G. Cleary and L. E. Trigg, "K*: an instance-based learner using an entropic distance measure," in Proceedings of Twelfth International Conference on Machine Learning, vol. 5, pp. 1–14, Tahoe City, CA, USA, July 1995.

[36] J. Huhn and E. Hullermeier, "FURIA: an algorithm for unordered fuzzy rule induction," Data Mining and Knowledge Discovery, vol. 19, no. 3, pp. 293–319, 2009.

[37] S. Ciss, Generalization Error and Out-of-bag Bounds in Random (Uniform) Forests, Ph.D. thesis, French University, Paris, France, 2015.


because of variability in channel behavior, including fading and shadowing, and also due to heterogeneity of device types and form factors [12].

Wi-Fi networks are deployed as Access Points (APs) that are prevalent everywhere. Utilizing these to capture an RSSI fingerprint (FP) is conveniently possible with devices as simple as smartphones or phablets. Wi-Fi fingerprint-based localization has the following benefits: no requirement of extra hardware at either the sender or the receiver side, utilization of already existent infrastructure, ease of implementation, and no essential need of building a propagation model, which may or may not depict real signal propagation at run time [13].

The infrastructure of APs allows us to collect a dataset comprising FPs at selected Reference Points (RPs) that essentially becomes a cue to the physical layout of the building, like a map. It is then utilized to prepare the localization system in the training phase. Once trained, the system is ready to be used, i.e., for an unseen FP captured anew, a room number or an associated label is returned by the system to estimate the location.

In this paper, a new localization methodology is presented that uses a combination of a data reduction technique, principal component analysis (PCA), a soft clustering technique, the Gaussian mixture model (GMM), and bootstrap aggregated/bagged ensembles of decision trees, commonly referred to as Random Forest Ensembles (RFEs). We aspire to provide an infrastructure-less indoor localization methodology which is scalable, is easy to deploy, and provides real-time response with high room-level accuracy instead of explicit coordinates. First of all, PCA was employed for raw data dimension reduction. Then, clustering was performed to split the data into similar groups to help the classifier better learn the data dynamics. Finally, a separate RFE is trained for every single cluster.

The remainder of the paper is organized as follows. Section 2 presents related work. Preliminary experimentation results are summarized in Section 3. Section 4 provides details of the localization methodology that we have proposed. Section 5 delves into the experimental setup, layout, and results for validation. The conclusion, along with possible future directions, is presented in Section 6.

1.1. Scope of the Study. It is important to declare the scope of the study here to give a prelude to the paper that follows. The paper is aimed at proposing a new methodology that is put to application in a practical and particular environment. The resulting constraints and limitations of our experimental regime are quite natural and fairly generalizable in terms of spatiotemporal aspects, specifically building size and type, the type of device collecting the fingerprints, and time of the year. The fingerprints were collected at the Software Engineering Centre of our own university. The building is double-storey and has architectural diversity in terms of rooms, corridors, and an inlaying garden. Nonetheless, an extensive dataset spanning multiple buildings can be obtained to ensure wider spatial diversity. Our dataset was collected using a single Android phone; however, with a large team using a variety of devices, data collection can be performed over different times of the year to obtain data both for training and testing of our approach.

2. Related Work

We summarize here the work on IPS based on Wi-Fi, that is, typically a WLAN standard IEEE 802.11 (b, a, g, ac, or any), or a combination of Wi-Fi with another wireless or sensory input. RADAR [14] from Microsoft® labs is the pioneering research work to employ Wi-Fi signals. Wi-Fi signals received at the base stations (Access Points) from a laptop were used for predicting the user's coordinates using a k-NN-based method and triangulation, reporting a median error of 2-3 m. Li et al. [12] combined affinity propagation, a message passing-based clustering algorithm, with a PSA-based artificial neural network (ANN) for Cartesian coordinates-based prediction. They reported a mean error as low as 1.89 m, and 2.9 m for 90% of estimates.

Song et al. [13] analyzed FP collection as an AP relevancy problem. Hidden Naïve Bayes (HNB) was used as a mechanism to infer the most relevant APs, and they suggested that redundant APs may be obviated for each RP through a variant of ReliefF with the Pearson product-moment correlation coefficient (PPMCC). Moreover, clustering was performed on RPs. One HNB was trained per cluster to approximate user coordinates. Cooper et al. [15] employed a combination of Wi-Fi and Bluetooth low energy radio signal FPs based on a boosting technique targeting room-level prediction. They trained one classifier per room based on a variant of AdaBoost that harnessed decision stumps in a one-vs-all fashion. Using the combination of Wi-Fi + BLE, they acquired 96% accuracy with 4.3E-03 seconds response time. Wang et al. [16], similar to Li et al., presented training of an ANN with back propagation that was based on PSA for RSSI measurements of RFID tags. They performed data preprocessing by normalizing the dataset to the [0, 1] range along with using a Gaussian filter. They estimated x and y coordinates, reporting a mean error of 0.34 m.

Xu et al. [17] utilized a multilayer neural network (MLNN) for Wi-Fi signals along with network boosting. They trained the MLNN in two stages commonly followed in deep learning, namely, pretraining using autoencoders and fine tuning using the back propagation algorithm, reporting a mean error of 1.09 m. Zhang et al. [18] proposed a coarse localizer composed of a four-layered deep neural network using stacked denoising autoencoders, succeeded by a hidden Markov model-based fine localizer, reporting a mean error of 0.39 m.

Calderoni et al. [19] utilized RFID tags' RSSI values targeting room-level accuracy in a hospital environment. They divided the total area into macroregions using a k-means variant, followed by a Random Forest trained per macroregion. Multiple random forests for which the cluster matching score was greater than a particular threshold determined the final prediction, with 83% reported accuracy. Jedari et al. [20] investigated room-level prediction using k-NN, rule-based JRip, and Random Forest classifiers based on Wi-Fi signals. They concluded that Random Forest, with 91.3% accuracy, produced much better results than k-NN (77.4%) and JRip (72.2%).


Mo et al. [21] proposed the usage of the kernel PCA (KPCA) algorithm for the coarse-level prediction of manually labelled clusters using Random Forest. They derived trained matrices from extracted KPCA features and prepared subradio maps. For prediction, the features extracted from coarse positioning, refined by the trained KPCA matrices, were fed to weighted k-NN (WK-NN) for final coordinates estimation. They reported an accuracy of 93% with an error distance of 2 m. Gorak et al. [22] employed Random Forest for finding important APs and applied threshold-based elimination. They determined malfunctioning APs during operation based on the important APs. They were able to report error rates as low as 4% for detecting the floor and an error of 2 m for horizontal detection, compared against 30% and 7 m, respectively, when malfunctioning APs were undetected. In [23], they divided the FP dataset into overlapping and/or nonoverlapping subsets as per the presence of RSSI from every AP. Furthermore, one Random Forest was trained for each such subset. They compared the results with a base Random Forest, signifying around 5% to 9% improvement in average reported error at the floor level. The performance in terms of detection of floors was unchanged.

The aforementioned are some recent efforts in the field of indoor localization using RF signals. k-NN works by storing all the data samples along with their marked ground truth labels. An unseen sample is compared with the complete dataset, based on a similarity measure/distance, to determine its nearest neighbors, where k is the number of neighbors considered and each neighbor may carry a weightage. The final decision on the sample's class is based on the majority of the labels of the k nearest neighbors. Several IPS based on k-NN and its variants, such as those of Oussalah et al. [11] and Niu et al. [24], do not scale well when the dataset grows because they require FP matching against the whole dataset.

Moreover, artificial neural networks and deep learning have recently gained a great deal of focus for indoor localization. Artificial neural networks try to mimic the human brain. They form multiple layers of neurons, namely, a single input layer, a single output layer, and one or more hidden layers. At every layer, several neurons are connected to one another according to a specific configuration and triggering function. Every layer affects and triggers neurons in the next layer, following the rules of the learning function, which eventually evolves into the final output. Heavy resource utilization is required during the training phase; however, their response time is negligible due to the minimal computation required. IPS employing ANNs and deep learning, such as Li et al. [12], Ding et al. [25], Leon et al. [26], Zhang et al. [18], and Tuncer and Tuncer [27], face the challenge of finding several tunable parameters, such as the optimal architecture, the number of layers, the learning function, and the number of neurons at every layer. Moreover, the convergence rate and the final accuracy of various configurations do not follow any specific trend. Sometimes a simpler 2-layer configuration takes more time to converge than a 4-layer network, and the accuracy achieved by a 6-layer network is lower than the accuracy obtained by a 3-layer ANN. Hence, heuristics are predominantly used for proposing an architecture based on ANNs, because Monte Carlo configuration testing is infeasible. An AP location change or the addition/removal of APs will necessitate retraining of the IPS, putting ANN and deep learning at a disadvantage.
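The k-NN procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used by any of the cited systems: the function name `knn_predict`, the toy fingerprints, and the choice of Euclidean distance with unweighted votes are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority room label among the k nearest training fingerprints.

    train: list of (rssi_vector, room_label) pairs; query: an rssi_vector.
    Euclidean distance is assumed as the similarity measure.
    """
    # Every prediction sorts the whole dataset by distance to the query,
    # which is why the response time grows with the number of stored samples.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy fingerprints (RSSI in dBm from two APs) for two rooms.
toy_train = [
    ([-40.0, -80.0], "L1"),
    ([-42.0, -78.0], "L1"),
    ([-41.0, -79.0], "L1"),
    ([-85.0, -45.0], "L2"),
    ([-83.0, -47.0], "L2"),
]

print(knn_predict(toy_train, [-43.0, -77.0], k=3))
```

Because the whole of `toy_train` is scanned for each query, the cost of one prediction is proportional to the dataset size, which is the scaling concern raised above.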

Inspired by existing works, we suggest a clustering-based multiclass classifier approach for room-level prediction, which is easy to train and deploy in terms of computational complexity, provides suitable accuracy, and offers a response time appropriate for real-time applications. Clustering follows the divide-and-conquer approach to help the classifier better learn a group of similar observations instead of the whole dataset. We perform soft classification (clustering) of FPs rather than of RPs, the latter being what most existing works do. PCA-based data dimension reduction helps us to reduce the response time of the system by decreasing the number of predictors.

This approach scales well in terms of response time with an increase in the number of rooms, since a single classifier is invoked for each location, and it provides the maximum accuracy reported so far, to the best of our knowledge.

3. Preliminary Experiments

Preliminary experiments were performed on a sample dataset collected from our departmental building to evaluate classifier suitability [28]. The results presented here are on the sample dataset. The detailed experimentation results on the complete dataset of all locations in the building are presented in Section 5.

3.1. Dataset Acquisition. A customized app was developed for an Android phone to record RSSI as vector data coming from Wi-Fi APs. The Wi-Fi FPs of all observable APs, within both the 2.4 and 5 GHz bands, at an RP were scanned using a commercial off-the-shelf Samsung phone (Galaxy J5). FPs were obtained at each RP while hovering the smartphone from 0° up to a right angle (90°) with respect to the floor, as shown in Figure 1, so as to make the phone face N, NW, W, SW, S, SE, E, and NE, with an effort to keep occlusion due to the human torso as minimal as possible [5, 9]. The user stood at the centre of the RP, held the phone used for FP collection in his hand, and captured multiple FPs in each of the 8 directions shown in Figure 1. A total of A APs were present in the premises, continuously emitting radio signals. These FPs were then stored in a DB, each with its respective room label.

3.2. Preprocessing. The resultant dataset was found to be sparsely populated with the identifiers of APs because of the presence/absence of these APs at the various RPs labelled in rooms and in the corridors of the building. The measured RSSI values varied between −98 dBm and −15 dBm (from weak to strong, as a result of distance from the APs). Consistent with the well-known practice of keeping the missing values slightly weaker than the weakest signal detected in the dataset [12, 14, 18, 19], the missing values were replaced with −100 dBm.
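As a minimal sketch of this preprocessing step, the Python snippet below turns one sparse scan into a dense RSSI vector, filling unheard APs with −100 dBm. The function name `to_rssi_vector`, the AP identifiers, and the dictionary layout of a scan are hypothetical and chosen only for illustration.

```python
MISSING_RSSI_DBM = -100.0  # slightly weaker than the weakest observed value (-98 dBm)

def to_rssi_vector(scan, ap_ids, missing=MISSING_RSSI_DBM):
    """Turn one Wi-Fi scan (dict: AP identifier -> RSSI in dBm) into a dense vector.

    ap_ids fixes the column order; APs not heard in this scan get `missing`.
    """
    return [float(scan.get(ap, missing)) for ap in ap_ids]

# Hypothetical AP identifiers and a scan in which AP2 was not observed.
ap_ids = ["AP1", "AP2", "AP3"]
scan = {"AP1": -52, "AP3": -98}
vector = to_rssi_vector(scan, ap_ids)
print(vector)
```

Fixing the column order through `ap_ids` keeps every fingerprint the same length A, which the later PCA and clustering steps rely on.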


3.3. Classifier Evaluation. We evaluated the performance of 60+ classifiers in WEKA for room-level prediction on the sample dataset, out of which the performances of the ten best performing classifiers are summarized in Table 1.

Taking into account all the performance measures, from accuracy to receiver operating characteristics (ROC) area, the best overall performance (in descending order) was attained by K*, k-nearest neighbors (k-NN), Random Forest Ensemble (RFE), Fuzzy Unordered Rule Induction Algorithm (FURIA), multilayer perceptron, deep learning for Java (Dl4jMlpClassifier), support vector machines, the Naive Bayes classifier, and finally AdaBoost, which uses stumps for decision-making. K* and k-NN are both instance-based classifiers and produced similar performance results. They were followed by RFE, FURIA, and multilayer perceptron (ANN). FURIA and ANN show an almost similar performance trend. We selected k-NN and ANN for comparison, as many existing works had utilized these, which makes comparison with other works easier. Moreover, we chose the K*, FURIA, and DeepLearning4J classifiers for comparison too, as they are state-of-the-art and relatively new machine learning methods. RFE is suited for large datasets, giving high accuracy and time efficiency. It is resilient to noise in data and is also capable of dealing with missing values in data. RFE utilizes bootstrapping, which reduces variance and keeps the bias in check, because creating different subsets of the training dataset along with a replacement mechanism ensures that the trees have little or no correlation. Therefore, overfitting is avoided, making it more generalizable. Both the training and prediction times of RFE, owing to the parallel computation supported by bagging, make it suited for real-time implementation of IPS. Hence, we selected RFE [29] as the suitable classifier module in our proposed methodology.

4. Proposed Localization Methodology

4.1. Problem Formulation. We treat localization/positioning as a combination of a clustering problem and a multiclass classification problem, where each room is considered to be a class. A two-dimensional indoor area is partitioned into R square grids of dimensions C × D m². The centre of each square grid is a reference point (RP). A device equipped with a wireless adapter card can sense wireless signals from a total of A APs at a certain RP at a given time, which forms the fingerprint $FP_i = \{RV_i, L_i\}$, with $RV_i = \{rssi_{i1}, rssi_{i2}, rssi_{i3}, \ldots, rssi_{iA}\}$, where $rssi_{ij}$ symbolizes the RSSI value from the jth AP (dBm) in the ith sample of FP collected and $L_i$ is the respective class/room label. Let N such FPs constitute the dataset. The localization function LF is learnt from the FP dataset to map the observed FP to a certain room label $L_x$, as described by the following equation:

$$L_x = LF(FP_i). \tag{1}$$

There are two phases of the proposed methodology (CEnsLoc), namely, the training phase and the prediction phase, as shown in Figure 2. First of all, a sparse Wi-Fi FP dataset with many missing values is collected. In the training phase, the collected FPs reserved for training the system are preprocessed for missing value replacement, followed by PCA application for dimension reduction. Then, GMM-based hard clustering is used to generate nonoverlapping/disjoint data subsets. For each such subset, a separate RFE is trained for location prediction and stored in the database. In the location prediction phase, the same steps of missing value replacement and PCA computation are performed on the captured FP. Then, the FP is matched with a single cluster using the stored GMM. The final prediction $L_x$ is generated by invoking the respective pretrained RFE for the best matched cluster/subset. These phases are formally elaborated in detail in Sections 4.2 and 4.3, respectively.

4.2. Training Phase. During training, the training dataset is fed to the preprocessing module, which replaces empty readings with the missing value replacement (MVr). PCA is then performed on the dataset for dimension reduction. An orthogonal transformation is applied by PCA for redundant information removal to decrease the number of predictors. The principal components (PCs) obtained by applying PCA are a set of linearly uncorrelated variables such that the maximum variance under some projection of the data is captured by the first principal component, and so on. Choosing the smaller of the number of predictors/APs and the number of samples minus one, A PCs are generated: $PC_1, PC_2, \ldots, PC_A$. For computation of PCs, first the mean RSSI value of each AP is subtracted from the ith RP using the following equation:

$$X_i = \frac{1}{N}\sum^{N} FP_i - \frac{1}{R}\cdot\frac{1}{N}\sum^{R}\sum^{N} FP_i, \tag{2}$$

where N is the total number of rows of samples/dataset and R is the total number of RPs. The PCs matrix is computed by the following equation:

$$PC_i = X_i \times E_A, \tag{3}$$

where $E_A$ is the eigenvector matrix of the average RSSI value of each AP in the ith RP.

Figure 1: User orientation during data collection (the user faces N, NE, E, SE, S, SW, W, and NW in turn, surrounded by access points AP1, AP2, AP3, ..., APA).

The resulting dataset is then divided into subsets using GMM clustering (K data subsets). Equation (4) is a distribution based on a 2D Gaussian, where the mean is represented by μ and the covariance matrix is Σ. A GMM with N as the number of overlapping distributions is described by the following equations:

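$$\mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{2\pi\sqrt{|\Sigma|}} \exp\left(-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right), \tag{4}$$

$$P(x) = \sum_{k=1}^{N} \pi_k\, \mathcal{N}(x \mid \mu_k, \Sigma_k), \tag{5}$$

$$\sum_{k=1}^{N} \pi_k = 1, \tag{6}$$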

where $\pi_k$ defines the mixing coefficient that expresses the weight of each mixing element (the weighted sum being 1). The resultant shape of the 2D Gaussian is the average of the individual distributions in terms of covariance and mixing coefficients. Assuming that a linear mix of weighted coefficients for each of the respective distributions' average and covariance is obtained, a final density function may be obtained by incorporating sufficient distributions. GMM clustering was chosen because of the similarity between the Gaussian distribution and the radio propagation characteristics of a Wi-Fi AP [30], which makes GMM a highly suitable candidate for clustering Wi-Fi RSSI vectors.
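To make Equations (4) to (6) concrete, the following Python sketch evaluates a 2D Gaussian density, the mixture density, and a hard cluster assignment. It is an illustration under stated assumptions, not the fitted model from the experiments: the function names, the two-component parameters, and the rule of picking the component with the largest weighted density are hypothetical choices, and fitting the parameters (for example by expectation-maximization) is not shown.

```python
import math

def gaussian_2d(x, mu, cov):
    """Equation (4): density of a 2D Gaussian with mean mu and 2x2 covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mu[0], x[1] - mu[1]]
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu)
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

def mixture_density(x, weights, means, covs):
    """Equation (5): weighted sum of the component densities."""
    return sum(w * gaussian_2d(x, m, c) for w, m, c in zip(weights, means, covs))

def match_cluster(x, weights, means, covs):
    """Hard assignment: index of the component with the largest weighted density."""
    scores = [w * gaussian_2d(x, m, c) for w, m, c in zip(weights, means, covs)]
    return scores.index(max(scores))

# Hypothetical two-component mixture over two principal components.
weights = [0.6, 0.4]  # Equation (6): mixing coefficients sum to 1
means = [[0.0, 0.0], [5.0, 5.0]]
covs = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]

print(match_cluster([0.5, 0.2], weights, means, covs))
print(match_cluster([4.8, 5.1], weights, means, covs))
```

The closed-form 2×2 inverse keeps the sketch dependency-free; data with more than two retained PCs would need a general matrix inverse and the corresponding normalization constant.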

Furthermore, each data subset is used for training an RFE to predict the room label as a multiclass classifier. The trained models of the GMM as well as all RFEs are stored for later use in the prediction phase.

The algorithm for the training and prediction phases of CEnsLoc is given in Algorithm 1.

Let the total number of samples in the training set be $N_{sample}$, the number of trees be $N_{tree}$, the maximum number of splits allowed be $S_{max}$, and the number of predictors for the classifier be $A'$; $f$ is the number of input predictors that are utilized to split at a tree node, and $tc$ denotes the total number of classes in the dataset. For finding the best split, RFE uses the Gini Index, as given in the following equation, where $P_j$ is the relative frequency of class $j$ in $N_{sample}$:

$$Gini(N_{sample}) = 1 - \sum_{j=1}^{tc} (P_j)^2. \tag{7}$$

RFE is trained using the method presented in Algorithm 2.
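Equation (7) can be illustrated with a short Python sketch. The function `gini_index` follows the equation directly; `split_gini`, which scores a candidate binary split by the size-weighted impurity of its two children, reflects the usual CART-style practice and is an assumption here, since the text does not spell out how candidate splits are compared. Both names are hypothetical.

```python
from collections import Counter

def gini_index(labels):
    """Equation (7): 1 minus the sum of squared relative class frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_gini(left_labels, right_labels):
    """Size-weighted Gini impurity of a candidate binary split (lower is better)."""
    n = len(left_labels) + len(right_labels)
    return (len(left_labels) * gini_index(left_labels)
            + len(right_labels) * gini_index(right_labels)) / n

# A node holding two rooms in equal proportion is maximally impure for two classes.
print(gini_index(["L1", "L1", "L2", "L2"]))
# A split that separates the rooms perfectly has zero impurity.
print(split_gini(["L1", "L1"], ["L2", "L2"]))
```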

4.3. Prediction Phase. The average of the collected RSSI FPs is fed to CEnsLoc. The A PCs are computed by a method similar to that of the training phase. The saved GMM model is invoked for cluster matching. The matched cluster's trained RFE is invoked, where the final decision is computed by the majority vote described in Algorithm 2.
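The individual steps of the prediction phase can be sketched as small Python helpers. This is a simplified illustration, assuming that the training phase stored a per-AP mean vector and a list of PCA eigenvectors; the function names and the toy numbers are hypothetical, and the GMM cluster matching and the trained trees themselves are left out.

```python
from collections import Counter

def mean_fingerprint(fps):
    """Element-wise average of several RSSI vectors captured at query time."""
    return [sum(col) / len(fps) for col in zip(*fps)]

def project(fp, train_mean, components):
    """Center with the stored training mean, then project onto stored PCA components.

    components: list of eigenvectors (one per retained PC), each of length A.
    """
    centered = [v - m for v, m in zip(fp, train_mean)]
    return [sum(c * e for c, e in zip(centered, comp)) for comp in components]

def majority_vote(tree_predictions):
    """Final room label: the class predicted by most trees of the matched RFE."""
    return Counter(tree_predictions).most_common(1)[0][0]

# Hypothetical run-time capture of two fingerprints from two APs.
fp = mean_fingerprint([[-50.0, -100.0], [-54.0, -96.0]])
pcs = project(fp, train_mean=[-50.0, -90.0], components=[[1.0, 0.0]])
print(fp, pcs, majority_vote(["L3", "L1", "L3"]))
```

Reusing the stored mean and eigenvectors, rather than recomputing PCA on the query, keeps the run-time FP in the same coordinate system as the training data.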

4.4. Time Complexity of Training and Prediction Phases for CEnsLoc. Ceteris paribus, the time complexity of the training phase and the prediction phase for CEnsLoc is essentially dependent upon the size of the experimental area, our acquisition regimen, the resultant dataset of FPs, and how the dataset is manipulated by the tandem of schemes we employ.

Figure 2: Proposed localization methodology (CEnsLoc). Training phase: Wi-Fi FPs collection (Wi-Fi RSSI FPs with room labels), preprocessing, PCA, generation of data subsets (GMM clustering), and one RFE trained for each cluster subset. Prediction phase: run-time observed Wi-Fi FP, preprocessing, PCA, cluster matching, cluster selection, and invocation of the pretrained respective RFE to obtain the estimated room Lx.

Table 1: A comparison of performance measures of various classification algorithms for Wi-Fi-based position estimation.

| Algorithm | Time to build model (sec) | Accuracy (%) | Kappa statistic | RMSE | Precision | Recall | F1 | MCC | ROC area |
|---|---|---|---|---|---|---|---|---|---|
| K* | 0 | 99.52 | 0.98 | 0.02 | 0.99 | 0.99 | 0.99 | 0.99 | 1 |
| k-NN | 0 | 99.06 | 0.99 | 0.03 | 0.99 | 0.99 | 0.99 | 0.98 | 0.99 |
| RFE | 1.11 | 98.76 | 0.98 | 0.04 | 0.98 | 0.98 | 0.98 | 0.98 | 1 |
| FURIA | 5.92 | 97.26 | 0.96 | 0.05 | 0.97 | 0.97 | 0.97 | 0.96 | 0.99 |
| Multilayer perceptron | 25.84 | 97.05 | 0.96 | 0.06 | 0.97 | 0.97 | 0.97 | 0.96 | 0.99 |
| J48 | 0.1 | 95.91 | 0.95 | 0.08 | 0.95 | 0.95 | 0.95 | 0.95 | 0.98 |
| Dl4jMlp classifier | 26.31 | 94 | 0.93 | 0.08 | 0.94 | 0.94 | 0.94 | 0.93 | 0.99 |
| SVM | 5.82 | 90.6 | 0.89 | 0.12 | 0.93 | 0.90 | 0.90 | 0.9 | 0.94 |
| Naive Bayes | 0.02 | 89.79 | 0.88 | 0.12 | 0.91 | 0.89 | 0.89 | 0.89 | 0.99 |
| AdaBoost with decision stump | 0.1 | 36.81 | 0.22 | 0.25 | 0.15 | 0.36 | 0.21 | 0.18 | 0.76 |

4.4.1. Time Complexity of Training. In the training phase, for PCA, time complexity is governed by the following equation:

$$O(\min(A^3, N^3)), \tag{8}$$

where A = number of predictors and N = number of observations.

For training a decision tree (DT) that has not been pruned, the expression is as follows:

$$O(A \times N \log(N)). \tag{9}$$

As RFE consists of numerous DTs, merely a small number f is used from the total predictors A. The complexity for a single DT in RFE is represented by Equation (10) and the complexity of $N_{tree}$ trees by Equation (11):

$$O(f \times N \log(N)), \tag{10}$$

$$O(N_{tree} \times f \times N \log(N)), \tag{11}$$

where $N_{tree}$ = number of trees in RFE and f = number of random features that are chosen to get the best split.

While trying to control the depth of the grown trees using $S_{max}$, the training complexity of one RFE is as follows:

$$O(N_{tree} \times f \times N \times S_{max}). \tag{12}$$

Since K ensembles are grown for predicting room level, the time-to-train complexity is represented by the following equation:

$$O(N_{tree} \times f \times N \times S_{max} \times K). \tag{13}$$

For GMM, the complexity is expressed by the following equation:

$$O(N \times K \times D^3), \tag{14}$$

where N = number of samples, K = number of components, and D = number of dimensions.

Incorporating all of the above, the time complexity to train CEnsLoc is as follows:

$$O(\min(A^3, N^3)) + O(N \times K \times D^3) + O(N_{tree} \times f \times N \times S_{max} \times K). \tag{15}$$

4.4.2. Time Complexity of Prediction. For prediction, the time complexity for PCA is given by the following equation:

$$O(\min(A^3, N^3)). \tag{16}$$

The complexities of a DT and of an ensemble in terms of prediction time are shown by Equations (17) and (18), respectively:

$$O(N \log(N)), \tag{17}$$

$$O(N_{tree} \times N \log(N)). \tag{18}$$

$S_{max}$ controls the depth of the trees; therefore, the complexity for an ensemble is given by the following equation:

$$O(N_{tree} \times N \times S_{max}). \tag{19}$$

Only a single RFE out of the K ensembles gets invoked for predicting a room.

The time complexity for GMM cluster matching is given by the following equation:

$$O(K \times D^3). \tag{20}$$

Finally, the overall complexity for CEnsLoc in terms of prediction is as follows:

$$O(\min(A^3, N^3)) + O(K \times D^3) + O(N_{tree} \times N \times S_{max}). \tag{21}$$
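To get a feel for which term of Equations (15) and (21) dominates, the three terms can be evaluated with all constants dropped. The Python sketch below does only that: the function names are hypothetical, the outputs are unitless growth estimates rather than measured times, and the sample arguments (40 APs, 23 PCs, two clusters, 132 trees, 8 random features, and 1024 maximum splits) are borrowed from the configuration reported in Section 5, while N = 14000 is a round, assumed number of training observations.

```python
def training_growth(A, N, K, D, n_tree, f, s_max):
    """The three terms of Equation (15), with constants dropped."""
    pca = min(A ** 3, N ** 3)
    gmm = N * K * D ** 3
    rfe = n_tree * f * N * s_max * K
    return {"pca": pca, "gmm": gmm, "rfe": rfe, "total": pca + gmm + rfe}

def prediction_growth(A, N, K, D, n_tree, s_max):
    """The three terms of Equation (21), with constants dropped."""
    pca = min(A ** 3, N ** 3)
    gmm = K * D ** 3
    rfe = n_tree * N * s_max
    return {"pca": pca, "gmm": gmm, "rfe": rfe, "total": pca + gmm + rfe}

# N=14000 is an assumed, illustrative number of training observations.
train = training_growth(A=40, N=14000, K=2, D=23, n_tree=132, f=8, s_max=1024)
# A single query fingerprint (N=1) at prediction time, also an assumption.
predict = prediction_growth(A=40, N=1, K=2, D=23, n_tree=132, s_max=1024)
print(train)
print(predict)
```

Under these assumed values the ensemble term is the largest contributor in both phases, which is consistent with the expressions above being driven mainly by the number of trees and the split limit.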

5. Experimental Results and Discussion

This section details the hardware equipment, the software used, and the particulars of the experiments that were used to evaluate the performance of CEnsLoc in light of accuracy, precision, recall, training time, and response time.

An Intel machine (64-bit Xeon X5650) with a master clock at 2.67 GHz, 24 GB RAM, and 64-bit Windows 10 Education was used for experimentation in MATLAB. The real dataset was developed through FP collection at the ground floor of the Software Engineering (SE) Centre, University of Engineering and Technology (UET), Lahore, Pakistan. The building's dimensions are 39 m × 31 m (1209 m²), containing offices, class rooms, laboratories, and open corridors. Figures 3 and 4 depict the building's floor plan, the room labels (L1–L10: closed rooms, L12: open corridor, and L11: a semiopen room), a total of 180 RPs, and the number of samples collected per room. RPs are represented by small coloured dots in the rooms in Figure 3, and Figure 4 lists the total number of samples collected at each location L1–L12. This notion of rooms was used because walls play a crucial role in the fluctuation of Wi-Fi RSSI values [31, 32]. The area was planned into a grid of cells of 1.5 × 1.5 m². Each cell centre was marked as the reference point (RP) for FP collection.

Input: training dataset with total A predictors; missing value replacement, MVr; maximum number of clusters, Kmax
Output: predicted location, Lx
For training:
    Replace empty values with MVr
    Apply PCA on dataset to generate A′ predictors
    For k = 1 → Kmax
        Generate clusters
        Generate and save k data subsets
        For each p ∈ k data subsets
            Train pth RFE using Algorithm 2 (training)
            Calculate performance measures
        End for
    End for
    Choose optimal configuration
    Save respective models for GMM, all RFEs
For prediction at a new point x:
    Replace missing values with MVr
    Apply PCA on the FP
    Match one cluster, Cmatch
    Invoke RFE of Cmatch using Algorithm 2 (prediction)

ALGORITHM 1: CEnsLoc training and prediction algorithm.

The complete dataset consisted of 20087 Wi-Fi RSSI FPs, in which a total of 40 APs were detected. Figure 4 depicts the number of FPs captured in each room/location, marked as L1–L12. It must be noted that all these APs belong to the university infrastructure, comprising the SE centre and its immediate neighboring buildings. The FPs were preprocessed following the same process described in Sections 3.1 and 3.2. PCA-based dimension reduction was employed, with optimal results found with 23 PCs. The resultant dataset was divided with a 70:30 stratified ratio into training and testing subsets. The experimental results are discussed using both 10-fold cross validation (10-CV) on the training subset and the unseen 30% test subset. GMM clustering then partitioned the training data into further subsets, and the optimal configuration was two clusters, with shared covariance set to true and a diagonal covariance. Furthermore, one RFE was trained for each subset, with 132 trees, 1024 maximum splits, and 8 random features. CEnsLoc was hosted on a machine, and the mean of the observed RSSI FPs was used for run-time location estimation after applying the same PCA model and the GMM cluster matching finalized in the training phase. The best matched cluster's respective RFE was invoked for room prediction. Various companion apps can query the location of subscribed users using the IPS configured as a server. Response time was computed as the average of the difference between the localization query time and the prediction generation time. The following formulae were used to compute the performance parameters:

$$\begin{aligned}
\text{accuracy} &= \frac{TP + TN}{TP + TN + FP + FN},\\
\text{precision} &= \frac{TP}{TP + FP},\\
\text{recall} &= \frac{TP}{TP + FN}.
\end{aligned} \tag{22}$$
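The formulae in Equation (22) translate directly into code. The Python sketch below computes them for a single room in a one-vs-rest fashion; how the per-room values were aggregated into the single figures reported in the tables is not stated in the text, so no averaging is shown. The function names and the six toy predictions are hypothetical, and zero denominators are not handled.

```python
def confusion_counts(y_true, y_pred, room):
    """One-vs-rest counts (TP, TN, FP, FN) for a single room label."""
    tp = sum(t == room and p == room for t, p in zip(y_true, y_pred))
    tn = sum(t != room and p != room for t, p in zip(y_true, y_pred))
    fp = sum(t != room and p == room for t, p in zip(y_true, y_pred))
    fn = sum(t == room and p != room for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

# Hypothetical ground truth and predictions for six test fingerprints.
y_true = ["L1", "L1", "L2", "L2", "L3", "L3"]
y_pred = ["L1", "L2", "L2", "L2", "L3", "L1"]
tp, tn, fp, fn = confusion_counts(y_true, y_pred, "L1")
print(accuracy(tp, tn, fp, fn), precision(tp, fp), recall(tp, fn))
```

Here `fp` denotes false positives, following Equation (22), and should not be confused with the abbreviation FP used for fingerprints elsewhere in the text.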

5.1. Classification Effectiveness and Efficiency. Tables 2 and 3 summarize the 10-CV performance evaluation of CEnsLoc, comparing it with k-NN [33], artificial neural network (ANN) [34], K* [35], FURIA [36], and DeepLearning4J [36]. The best results are highlighted throughout with boldface. k-NN results were averaged over six different configurations based on the number of neighbors, the similarity measure employed, and the neighbor weightage. Similarly, six different configurations of ANN, namely, 2-, 3-, and 4-layer ANNs, each having a varying number of neurons per layer and employing two learning algorithms, SCG and RBP, were averaged out to obtain the results. For K*, the entropic blend percentage was varied from 10 to 90 percent. The results for FURIA were obtained by varying the count of folds for growth and pruning, as well as by varying the minimum instance weight for each split. DeepLearning4J results were obtained by varying the number of neurons per layer, the number of dense layers in the network, and the training algorithm, out of many possible variations in the hyperparameters. The results of CEnsLoc were generated by the optimal configuration found for our proposed approach. For all the approaches, the same set of configurations used during the 10-CV performance evaluation was used for result generation on the 30% unseen stratified test data subset, whose results are presented in Table 4.

The best performance on both the 10-CV results (Table 2) and the test dataset (Table 4) was obtained by CEnsLoc, with 97% and 95% accuracy, followed by FURIA (92%, 90%), k-NN (91%, 89%), K* (90%, 87%), ANN (85%, 82%), and DeepLearning4J (73%, 71%), respectively. The results are presented on both 10-CV and the unseen test data subset to show that, for a small dataset, 10-CV results can provide a good performance estimate, as the results on the test data subset showed trends similar to those of the 10-CV results. Although deep learning and ANN provide good performance results, finding the optimal configuration for a huge number of tunable parameters is a tricky task. Moreover, during experiments, it was found that, despite generic guidelines for parameter tuning, the performance measures and convergence rate of ANN as well as deep learning schemes fluctuate highly with even a slight variation in the number of neurons in layers, the training algorithm, and the number of hidden layers, which makes it harder to find the optimal configuration for ANN and deep learning models.

Lazy and instance-based approaches such as k-NN and K* are also good candidates for indoor localization; however, they have a very limited number of tunable parameters, resulting in an inability to surpass CEnsLoc performance despite trying different parameter combinations. They do not generalize as well as RFE, since the end result of prediction is heavily dependent on the majority vote by the k closest matched samples in the dataset. They also need template matching with the entire dataset for one location prediction, which results in a growing response time with an increasing number of samples in the dataset. Such growth is highly likely in practical real-world scenarios, as in a typical building there is quite a large number of visible APs and a sufficiently large number of samples is also required for classifiers to work properly.

Input: data subset with total A′ predictors for training; number of trees, Ntree; allowed maximum number of splits, Smax; random number of predictors/features, f
Output: estimated location, Lx
For system training:
    Step 1: for l = 1 to Ntree
        (i) Choose a bootstrap sample set (SS) of size Nsample with replacement from the training data subset
        (ii) Generate a Random Forest tree (Tl) on SS by recursively iterating (a)–(c) for every terminal node of the tree until the maximum number of splits (Smax) is reached
            (a) Randomly select f features/variables from the A′ predictors (f << A′)
            (b) Choose the best feature/split-point from the f, employing the Gini Index
            (c) Split the node into two children nodes
    Step 2: Produce the resulting ensemble of trees {Tl}, l = 1, ..., Ntree
For location prediction at a new point x from the RFE, Lrf^Ntree:
    Let Lm(x) be the room/class prediction by the mth RFE tree
    Lrf^Ntree(x) = majority vote {Lm(x)}, m = 1, ..., Ntree

ALGORITHM 2: RFE classification algorithm for training and prediction.

A minimum response time of 2.05E−05 seconds was obtained by FURIA. DeepLearning4J stood second with a response time of 6.82E−05 seconds. ANN, k-NN, and CEnsLoc all had response times on the scale of E−04 seconds, which is 10 times slower than the aforementioned two approaches, but the difference was trifling and cannot be detected by any human user of the system. The accuracy, precision, and recall provided by CEnsLoc were much greater than those of all other approaches.

Figure 3: Room labels and marked RPs (floor plan showing locations L1–L12; the open corridor L12 spans several regions).

Figure 4: Number of samples per room (bar chart; horizontal axis: locations L1–L12; vertical axis: number of samples, 0 to 5000; bar labels in extracted order: 446, 446, 717, 2700, 2700, 2700, 2700, 1446, 250, 743, 743, 4496).

FURIA, which is an upgraded version of the RIPPER algorithm, stands out as the second best performer regarding indoor localization, indicating that rule-based algorithms, especially fuzzified versions with good generalization capabilities, perform well for location estimation as well. However, it lagged behind CEnsLoc in terms of accuracy, precision, and recall by 5%, 10%, and 15%, respectively.

The details of both the 10-CV and test dataset results for all the IPS compared and CEnsLoc are depicted in Figure 5 for side-by-side visual comparison.

5.2. Out-of-Bag (OOB) Error Results. There is a performance measure called OOB error, which is peculiar to RFE. During training of RFE, data subsets are generated with replacement, resulting in some repeated and some left-out observations (OOB observations) for each tree. That particular tree does not train on these left-out OOB observations. The prediction capability of RFE can be measured using "OOB loss," which is the error made on unseen OOB observations during training. The OOB loss measure has been investigated and shown to provide an upper bound on testing error [37], which is specifically useful for small-sized datasets. Hence, OOB error can be used just like, or instead of, an unseen test data subset if the available dataset is small or an unseen test dataset is unavailable. It provides a very good estimate of the trained classifier's generalization capability. Table 5 summarizes the OOB loss compared with the averaged-out 10-CV loss, indicating that it indeed bounds it.
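How OOB observations arise can be seen from a small Python sketch of bootstrap sampling. It draws sample indices with replacement for a single tree and reports which indices were never drawn; the function name, the fixed seed, and the sample count of 1000 are illustrative assumptions, and no tree is actually trained.

```python
import random

def bootstrap_with_oob(n_samples, rng):
    """Draw n_samples indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag

rng = random.Random(0)  # fixed seed so the sketch is repeatable
in_bag, oob = bootstrap_with_oob(1000, rng)

# The tree built from in_bag never sees the oob indices, so its errors on
# them behave like errors on unseen test data.
print(len(set(in_bag)), len(oob))
```

For large samples, the expected fraction of left-out observations approaches 1/e, roughly 37%, so every tree has a sizeable set of unseen observations on which its error can be measured.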

6. Conclusion and Future Work

Location prediction/estimation provides derivation of meaningful context for a broad range of services and applications. Indoor localization can open up a plethora of new opportunities because humans spend most of their time indoors. CEnsLoc offers shorter response time and an overall improvement in accuracy, precision, and recall. With only a few parameters to be tuned, it is suited for FP-based localization, which requires frequent recollection of data and retraining. CEnsLoc was able to attain 97% accuracy in comparison with other IPS averaged over 6 different configurations, namely, FURIA, k-NN, K*, ANN, and DeepLearning4J, with 92%, 91%, 90%, 85%, and 73% accuracies, respectively. It can be utilized for elderly assistance, navigation, smart buildings, and smart transportation, to name a few potential applications.

Our future work includes deployment of CEnsLoc across a wide range of civil infrastructures, including different floors and/or buildings, to understand its performance in more detail, as well as its scalability using crowdsourcing. We also aim to build safety, security, and evacuation guide applications with CEnsLoc at their core for users in offices, universities, and retail. Furthermore, integration with GPS, Bluetooth, and PDR to further enhance accuracy, utilizing the hybrid technologies available at a given time, is also part of the plan.

Table 2: Tenfold cross-validated performance evaluation and comparison of CEnsLoc.

| | Accuracy | Precision | Recall |
|---|---|---|---|
| k-NN | 0.91 | 0.88 | 0.87 |
| ANN | 0.85 | 0.83 | 0.78 |
| K* | 0.90 | 0.96 | 0.62 |
| FURIA | 0.92 | 0.89 | 0.82 |
| DeepLearning4J | 0.73 | 0.51 | 0.41 |
| CEnsLoc | 0.97 | 0.98 | 0.97 |

Table 3: Tenfold cross-validated performance comparison of CEnsLoc in terms of training and response time (seconds).

| Time (sec) | Training time (10-fold) | Avg. training time (1-fold) | Response time (dataset) | Response time (1 sample) |
|---|---|---|---|---|
| k-NN | - | - | 1.97 | 1.70E−04 |
| ANN | 267.20 | 26.72 | 0.18 | 1.41E−04 |
| K* | - | - | 103.09 | 8.17E−02 |
| FURIA | 15.87 | 1.58 | 0.03 | 2.05E−05 |
| DeepLearning4J | 755.59 | 75.5 | 0.09 | 6.82E−05 |
| CEnsLoc | 140.07 | 14.007 | 2.35 | 2.08E−04 |

Table 4: Performance evaluation and comparison of CEnsLoc on the test subset.

| | Accuracy | Precision | Recall |
|---|---|---|---|
| k-NN | 0.89 | 0.87 | 0.85 |
| ANN | 0.82 | 0.81 | 0.76 |
| K* | 0.87 | 0.94 | 0.60 |
| FURIA | 0.90 | 0.86 | 0.79 |
| DeepLearning4J | 0.71 | 0.48 | 0.39 |
| CEnsLoc | 0.95 | 0.96 | 0.94 |

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2018R1D1A1B07048697).

References

[1] J. Torres-Sospedra, R. Montoliu, S. Trilles, O. Belmonte, and J. Huerta, "Comprehensive analysis of distance and similarity measures for Wi-Fi fingerprinting indoor positioning systems," Expert Systems with Applications, vol. 42, no. 23, pp. 9263–9278, 2015.

[2] H. Mehmood and N. K. Tripathi, "Cascading artificial neural networks optimized by genetic algorithms and integrated with global navigation satellite system to offer accurate ubiquitous positioning in urban environment," Computers, Environment and Urban Systems, vol. 37, no. 1, pp. 35–44, 2013.

[3] M. M. Soltani, A. Motamedi, and A. Hammad, "Enhancing cluster-based RFID tag localization using artificial neural networks and virtual reference tags," Automation in Construction, vol. 54, pp. 93–105, 2015.

[4] M. L. Rodrigues, L. F. M. Vieira, and M. F. M. Campos, "Fingerprinting-based radio localization in indoor environments using multiple wireless technologies," in Proceedings of IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, pp. 1203–1207, Toronto, ON, Canada, September 2011.

[5] J. Luo and H. Gao, "Deep belief networks for fingerprinting indoor localization using ultrawideband technology," International Journal of Distributed Sensor Networks, vol. 12, no. 1, article 5840916, 2016.

[6] S. Saeedi, L. Paull, M. Trentini, and H. Li, "Neural network-based multiple robot simultaneous localization and mapping," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 880–885, 2011.

[7] M. Azizyan, R. R. Choudhury, and I. Constandache, "SurroundSense: mobile phone localization via ambience fingerprinting," in Proceedings of the 15th Annual International Conference on Mobile Computing and Networking-MobiCom'09, pp. 261–272, Beijing, China, 2009.

[8] L. Pei, R. Chen, J. Liu et al., "Motion recognition assisted indoor wireless navigation on a mobile phone," in Proceedings of the 23rd International Technical Meeting of The Satellite Division of the Institute of Navigation, pp. 3366–3375, Portland, OR, USA, September 2010.

[9] P. S. Nagpal and R. Rashidzadeh, "Indoor positioning using magnetic compass and accelerometer of smartphones," in Proceedings of International Conference on Selected Topics in Mobile and Wireless Networking (MoWNeT), pp. 140–145, Montreal, Canada, 2013.

[10] J. Menke and A. Zakhor, "Multi-modal indoor positioning of mobile devices," in Proceedings of IEEE International Conference on Indoor Positioning and Indoor Navigation, pp. 13–16, Alberta, Canada, October 2015.

[11] M. Oussalah, M. Alakhras, and M. I. Hussein, "Multivariable fuzzy inference system for fingerprinting indoor localization," Fuzzy Sets and Systems, vol. 269, pp. 65–89, 2015.

[12] N. Li, J. Chen, Y. Yuan, X. Tian, Y. Han, and M. Xia, "A Wi-Fi indoor localization strategy using particle swarm optimization based artificial neural networks," International Journal of Distributed Sensor Networks, vol. 12, no. 3, article 4583147, 2016.

Figure 5: Performance measures on both 10-CV and test dataset (bar chart of precision, recall, and accuracy, with a vertical axis from 0 to 1.2, for k-NN, 4-layer ANN (SCG), 4-layer ANN (RBP), 3-layer ANN (SCG), 3-layer ANN (RBP), 2-layer ANN (SCG), 2-layer ANN (RBP), K*, FURIA, DeepLearning4J, and CEnsLoc).

Table 5: OOB loss.

| Average 10-fold loss | OOB loss |
|---|---|
| 0.0292813 | 0.0300123 |

[13] C. Song, J. Wang, and G. Yuan, "Hidden naive bayes indoor fingerprinting localization based on best-discriminating AP selection," ISPRS International Journal of Geo-Information, vol. 5, no. 10, p. 189, 2016.

[14] P. Bahl and V. N. Padmanabhan, "RADAR: an in-building RF-based user location and tracking system," in Proceedings of IEEE INFOCOM 2000, Conference on Computer Communications, Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies (Cat. No. 00CH37064), vol. 2, pp. 775–784, Tel Aviv, Israel, March 2000.

[15] M. Cooper, J. Biehl, G. Filby, and S. Kratz, "LoCo: boosting for indoor location classification combining Wi-Fi and BLE," Personal and Ubiquitous Computing, vol. 20, no. 1, pp. 1–14, 2016.

[16] C. Wang, F. Wu, Z. Shi, and D. Zhang, "Indoor positioning technique by combining RFID and particle swarm optimization-based back propagation neural network," Optik (Stuttg), vol. 127, no. 17, pp. 6839–6849, 2016.

[17] J. Xu, H. Dai, and W. Ying, "Multi-layer neural network for received signal strength-based indoor localisation," IET Communications, vol. 10, no. 6, pp. 717–723, 2016.

[18] W. Zhang, K. Liu, W. Zhang, Y. Zhang, and J. Gu, "Deep neural networks for wireless localization in indoor and outdoor environments," Neurocomputing, vol. 194, pp. 279–287, 2016.

[19] L. Calderoni, M. Ferrara, A. Franco, and D. Maio, "Indoor localization in a hospital environment using Random Forest classifiers," Expert Systems with Applications, vol. 42, no. 1, pp. 125–134, 2015.

[20] E. Jedari, Z. Wu, R. Rashidzadeh, and M. Saif, "Wi-Fi based indoor location positioning employing random forest classifier," in Proceedings of 2015 International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 13–16, Banff, Alberta, Canada, October 2015.

[21] Y. Mo, Z. Zhang, Y. Lu, W. Meng, and G. Agha, "Random forest based coarse locating and KPCA feature extraction for indoor positioning system," Mathematical Problems in Engineering, vol. 2014, Article ID 850926, 8 pages, 2014.

[22] R. Gorak and M. Luckner, "Malfunction immune Wi–Fi localisation method," in Computational Collective Intelligence, pp. 328–337, Springer, Cham, Switzerland, 2015.

[23] R. Gorak and M. Luckner, "Modified random forest algorithm for Wi–Fi indoor localization system," in Proceedings of International Conference on Computational Collective Intelligence, pp. 147–157, Cham, Switzerland, September 2016.

[24] J. Niu, B. Wang, L. Cheng, and J. J. P. C. Rodrigues, "WicLoc: an indoor localization system based on WiFi fingerprints and crowdsourcing," in Proceedings of 2015 IEEE International Conference on Communications (ICC), pp. 3008–3013, London, UK, June 2015.

[25] G. Ding, Z. Tan, J. Zhang, and L. Zhang, "Fingerprinting localization based on affinity propagation clustering and artificial neural networks," in Proceedings of IEEE Wireless Communications and Networking Conference (WCNC), pp. 2317–2322, Shanghai, China, April 2013.

[26] O. Leon, J. Hernandez-Serrano, and M. Soriano, "Securing cognitive radio networks," International Journal of Communication Systems, vol. 23, no. 5, pp. 633–652, 2010.

[27] S. Tuncer and T. Tuncer, "Indoor localization with bluetooth technology using artificial neural networks," in Proceedings of the IEEE 19th International Conference on Intelligent Engineering Systems, pp. 213–217, Bratislava, Slovakia, September 2015.

[28] B. A. Akram, A. H. Akbar, B. Wajid, O. Shafiq, and A. Zafar, "LocSwayamwar: finding a suitable ML algorithm for wi-fi fingerprinting based indoor positioning system," in Lecture Notes in Electrical Engineering, A. Boyaci, A. Ekti, M. Aydin, and S. Yarkan, Eds., vol. 504, Springer, Singapore, 2018.

[29] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.

[30] K. Kaji and N. Kawaguchi, "Design and implementation of wifi indoor localization based on Gaussian mixture model and particle filter," in Proceedings of 2012 International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 1–9, Sydney, Australia, November 2012.

[31] C. Wu, Z. Yang, Y. Liu, and W. Xi, "WILL: wireless indoor localization without site survey," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 4, pp. 839–848, 2013.

[32] S. Yang, P. Dessai, M. Verma, and M. Gerla, "FreeLoc: calibration-free crowdsourced indoor localization," in Proceedings of IEEE INFOCOM, pp. 2481–2489, Turin, Italy, April 2013.

[33] T. Seidl and H.-P. Kriegel, "Optimal multi-step k-nearest neighbor search," ACM SIGMOD Record, vol. 27, no. 2, pp. 154–165, 1998.

[34] W S Mcculloch and W Pitts ldquoA logical calculus nervousactivityrdquo Bulletin of Mathematical Biology vol 52 no l-2pp 99ndash115 1990

[35] J G Cleary and L E Trigg ldquoKlowast an instance-based learnerusing an entropic distance measurerdquo in Proceedings of TwelfthInternational Conference on Machine Learning vol 5 pp 1ndash14 Tahoe City CA USA July 1995

[36] J Huhn and E Hullermeier ldquoFURIA an algorithm for un-ordered fuzzy rule inductionrdquo Data Mining and KnowledgeDiscovery vol 19 no 3 pp 293ndash319 2009

[37] S Ciss Generalization Error and Out-of-bag Bounds inRandom (Uniform) Forests PhD thesis French UniversityParis France 2015

Mobile Information Systems 11



Mo et al. [21] proposed using the kernel PCA (KPCA) algorithm for coarse-level prediction of manually labelled clusters using Random Forest. They derived trained matrices from the extracted KPCA features and prepared sub-radio maps. For prediction, the features extracted from coarse positioning, refined by the trained KPCA matrices, were fed to weighted k-NN (WK-NN) for final coordinate estimation. They reported an accuracy of 93% with an error distance of 2 m. Gorak et al. [22] employed Random Forest to find important APs and applied threshold-based elimination. They determined malfunctioning APs during operation based on the important APs. They reported error rates as low as 4% for floor detection and a 2 m error for horizontal detection, compared against 30% and 7 m, respectively, when malfunctioning APs went undetected. In [23], they divided the FP dataset into subsets, which were overlapping and/or nonoverlapping, according to the presence of RSSI from every AP. Furthermore, one Random Forest was trained for each such subset. They compared the results with a base Random Forest, showing around 5% to 9% improvement in the average reported error at the floor level. The performance in terms of detection of floors was unchanged.

The aforementioned are some recent efforts in the field of indoor localization using RF signals. k-NN works by storing all the data samples along with their marked ground truth labels. An unseen sample is compared with the complete dataset based on a similarity measure/distance to determine its nearest neighbors, where k is the number of neighbors, taking the neighbors' weightage into account. The final decision on the sample's class is based on the majority of the labels of the k nearest neighbors. Several IPS based on k-NN and its variants, such as Oussalah et al. [11] and Niu et al. [24], do not scale well when the dataset grows because they require FP matching against the whole dataset. Moreover, artificial neural networks (ANNs) and deep learning have recently gained a great deal of attention for indoor localization. ANNs try to mimic the human brain. They form multiple layers of neurons, namely, a single input layer, a single output layer, and one or more hidden layers. At every layer, several neurons are connected to one another according to a specific configuration and triggering function. Every layer affects and triggers neurons in the next layer following the rules of the learning function, which eventually evolves into the final output. Heavy resource utilization is required during the training phase; however, their response time is negligible because little computation is required. IPS employing ANNs and deep learning, such as Li et al. [12], Ding et al. [25], Leon et al. [26], Zhang et al. [18], and Tuncer and Tuncer [27], face the challenge of finding several tunable parameters such as the optimal architecture, number of layers, learning function, and number of neurons at every layer. Moreover, the convergence rate and the final accuracy of various configurations do not follow any specific trend. Sometimes a simpler 2-layer configuration takes more time to converge than a 4-layer network, and the accuracy achieved by a 6-layer network is lower than that obtained by a 3-layer ANN. Hence, heuristics are predominantly used for proposing an ANN-based architecture because Monte Carlo configuration testing is infeasible. An AP location change or the addition/removal of APs makes retraining of the IPS essential, putting ANN and deep learning at a disadvantage.
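The k-NN matching described above can be illustrated with a minimal sketch. It assumes Euclidean distance as the similarity measure and unweighted votes; the function name `knn_predict` and the toy fingerprints are hypothetical and are not taken from the compared systems.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority room label among the k nearest fingerprints.

    train: list of (rssi_vector, room_label) pairs; query: rssi_vector.
    Euclidean distance is an assumed similarity measure.
    """
    # Every prediction scans the whole training set, which is why the
    # response time grows with the size of the dataset.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Toy fingerprints (dBm) from two hypothetical rooms and two APs.
train = [
    ([-40.0, -80.0], "L1"),
    ([-42.0, -78.0], "L1"),
    ([-45.0, -82.0], "L1"),
    ([-85.0, -35.0], "L2"),
    ([-83.0, -38.0], "L2"),
]
print(knn_predict(train, [-41.0, -79.0], k=3))
```

The full sort over `train` makes the scaling issue visible: the cost of one query is tied to the number of stored fingerprints.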

Inspired by existing works, we suggest a clustering-based multiclass classifier approach for room-level prediction, which is easy to train and deploy in terms of computational complexity, provides suitable accuracy, and offers a response time appropriate for real-time applications. Clustering follows the divide-and-conquer approach to help the classifier better learn a group of similar observations instead of the whole dataset. We perform soft classification (clustering) of FPs, not of RPs as has mostly been done in the existing works. PCA-based data dimension reduction helps us reduce the response time of the system by decreasing the number of predictors.

This approach scales well in terms of response time with an increase in the number of rooms, since a single classifier is invoked for each location prediction, and it provides, to the best of our knowledge, the highest accuracy reported so far.

3. Preliminary Experiments

Preliminary experiments were performed on a sample dataset collected from our departmental building to evaluate classifier suitability [28]. The results presented here are on the sample dataset. The detailed experimental results on the complete dataset of all locations in the building are presented in Section 5.

3.1. Dataset Acquisition. A customized app was developed for an Android phone to record RSSI as vector data coming from Wi-Fi APs. The Wi-Fi FPs of all observable APs in both the 2.4 and 5 GHz bands at an RP were scanned using a commercial off-the-shelf Samsung phone (Galaxy J5). FPs were obtained at each RP while hovering the smartphone from 0° up to a right angle (90°) with respect to the floor, as shown in Figure 1, so as to make the phone face N, NW, W, SW, S, SE, E, and NE, with an effort to keep occlusion due to the human torso to a minimum [5, 9]. The user stood at the centre of the RP, held the phone used for FP collection in his hand, and captured multiple FPs in each of the 8 directions shown in Figure 1. A total of A APs were present in the premises, continuously emitting radio signals. These FPs were then stored in a DB, each with its respective room label.

3.2. Preprocessing. The resultant dataset was found to be sparsely populated with AP identifiers because of the presence/absence of these APs at the various RPs labelled in rooms and in the corridors of the building. The measured RSSI values varied between −98 dBm and −15 dBm (from weak to strong, as a result of distance from the APs). Consistent with the well-known practice of keeping the missing values slightly weaker than the weakest signal detected in the dataset [12, 14, 18, 19], the missing values were replaced with −100 dBm.
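A minimal sketch of this missing value replacement, assuming each scan is available as a mapping from AP identifier to RSSI; the helper name `to_rssi_vector` and the AP identifiers are hypothetical.

```python
MISSING_RSSI_DBM = -100.0  # slightly weaker than the weakest reading (-98 dBm)

def to_rssi_vector(scan, ap_ids, fill=MISSING_RSSI_DBM):
    """Turn one sparse scan {ap_id: rssi} into a dense vector over ap_ids."""
    return [float(scan.get(ap, fill)) for ap in ap_ids]

ap_ids = ["AP1", "AP2", "AP3", "AP4"]  # hypothetical AP identifiers
scan = {"AP1": -47, "AP3": -98}        # AP2 and AP4 were not heard at this RP
print(to_rssi_vector(scan, ap_ids))    # [-47.0, -100.0, -98.0, -100.0]
```

Fixing the AP order once in `ap_ids` keeps every fingerprint the same length, which the later PCA step relies on.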


3.3. Classifier Evaluation. We evaluated the performance of 60+ classifiers in WEKA for room-level prediction on the sample dataset, out of which the ten best-performing classifiers are summarized in Table 1.

Taking into account all the performance measures, from accuracy to receiver operating characteristic (ROC) area, the best overall performance (in descending order) was attained by K*, k-nearest neighbors (k-NN), Random Forest Ensemble (RFE), the Fuzzy Unordered Rule Induction Algorithm (FURIA), multilayer perceptron, deep learning for Java (Dl4jMlpClassifier), support vector machines, the Naive Bayes classifier, and finally AdaBoost, which uses stumps for decision-making. K* and k-NN are both instance-based classifiers and produced similar performance results, followed by RFE, FURIA, and multilayer perceptron (ANN). FURIA and ANN show an almost similar performance trend. We selected k-NN and ANN for comparison because many existing works have utilized them, which makes comparison with other works easier. Moreover, we also chose the K*, FURIA, and DeepLearning4J classifiers for comparison, as they are state-of-the-art and relatively new machine learning methods. RFE is suited to large datasets, giving high accuracy and time efficiency. It is resilient to noise in data and is also capable of dealing with missing values. RFE utilizes bootstrapping, which reduces variance and keeps the bias in check, because creating different subsets of the training dataset by sampling with replacement ensures that the trees have little or no correlation. Therefore, overfitting is avoided, making it more generalizable. Both the training and prediction times of RFE, owing to the parallel computation supported by bagging, make it suited for real-time implementation of IPS. Hence, we selected RFE [29] as the suitable classifier module in our proposed methodology.

4. Proposed Localization Methodology

4.1. Problem Formulation. We treat localization/positioning as a combination of a clustering problem and a multiclass classification problem, where each room is considered to be a class. A two-dimensional indoor area is partitioned into R square grids of dimensions $C \times D$ m². The centre of each square grid is a reference point (RP). A device equipped with a wireless adapter card can sense wireless signals from a total of A APs at a certain RP at a given time, which forms the fingerprint $FP_i = (RV_i, L_i)$, $RV_i = (rssi_{i1}, rssi_{i2}, rssi_{i3}, \ldots, rssi_{iA})$, where $rssi_{ij}$ symbolizes the RSSI value from the jth AP (dBm) in the ith sample of FP collected and $L_i$ is the respective class/room label. Let N such FPs constitute the dataset. The localization function LF is learnt from the FP dataset to map the observed FP to a certain room label $L_x$, as described by the following equation:

$$L_x = LF(FP_i) \tag{1}$$

There are two phases of the proposed methodology (CEnsLoc), namely, the training phase and the prediction phase, as shown in Figure 2. First of all, a sparse Wi-Fi FP dataset with many missing values is collected. In the training phase, the collected FPs reserved for training the system are preprocessed for missing value replacement, followed by PCA for dimension reduction. Then, GMM-based hard clustering is used to generate nonoverlapping/disjoint data subsets. For each such subset, a separate RFE is trained for location prediction and stored in the database. In the location prediction phase, the same steps of missing value replacement and PCA computation are performed on the captured FP. Then, the FP is matched with a single cluster using the stored GMM. The final prediction $L_x$ is generated by invoking the respective pretrained RFE for the best matched cluster/subset. These phases are formally elaborated in Sections 4.2 and 4.3, respectively.

4.2. Training Phase. During training, the training dataset is fed to the preprocessing module, which replaces empty readings with the missing value replacement (MVr). PCA is then performed on the dataset for dimension reduction. PCA applies an orthogonal transformation to remove redundant information and decrease the number of predictors. Applying PCA yields principal components (PCs), a set of linearly uncorrelated variables such that the first principal component captures the maximum variance of any projection of the data, and so on. Choosing the smaller of the number of predictors/APs and the number of samples minus one, A PCs are generated: $PC_1, PC_2, \ldots, PC_A$. For computation of the PCs, first the mean RSSI value of each AP is subtracted from the ith RP using the following equation:

$$X_i = \frac{1}{N}\sum^{N} FP_i - \frac{1}{R}\left(\frac{1}{N}\sum^{R}\sum^{N} FP_i\right) \tag{2}$$

where N is the total number of rows of samples in the dataset and R is the total number of RPs. The PC matrix is computed by the following equation:

$$PC_i = X_i \times E_A \tag{3}$$

where $E_A$ is the eigenvector matrix of the average RSSI value of each AP in the ith RP. The resulting dataset is then divided

[Figure 1: User orientation during data collection (eight compass directions N, NE, E, SE, S, SW, W, and NW, with surrounding access points AP1, AP2, AP3, ..., APA).]


into subsets using GMM clustering (K data subsets). Equation (4) is a distribution based on a 2D Gaussian, where the mean is represented by $\mu$ and the covariance matrix by $\Sigma$. A GMM with N as the number of overlapping distributions is described by the following equations:

$$\mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{2\pi\sqrt{|\Sigma|}} \exp\left(-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right) \tag{4}$$

$$P(x) = \sum_{k=1}^{N} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k) \tag{5}$$

$$\sum_{k=1}^{N} \pi_k = 1 \tag{6}$$

where $\pi_k$ defines the mixing coefficient expressing the weight of each mixture component (the weights sum to 1). The resultant shape of the 2D Gaussian is the average of the individual distributions in terms of covariance and mixing coefficients. Assuming that a linear mix of weighted coefficients for each respective distribution's average and covariance is obtained, and by incorporating sufficient distributions, a final density function may be obtained. The reason for choosing GMM clustering was the similarity between the Gaussian distribution and the radio propagation characteristics of a Wi-Fi AP [30], which makes GMM a highly suitable candidate for clustering Wi-Fi RSSI vectors.
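As an illustration of Equations (4) to (6), the sketch below evaluates a 2D Gaussian density and assigns a point to the mixture component with the largest weighted density (hard assignment). It is restricted to two dimensions to match Equation (4); the function names and the toy mixture parameters are hypothetical, and a fitted model would supply its own weights, means, and covariances.

```python
import math

def gaussian_2d(x, mu, cov):
    """Density of a 2D Gaussian as in Equation (4); cov is [[a, b], [b, d]]."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form (x - mu)^T cov^{-1} (x - mu) using the closed-form 2x2 inverse.
    quad = (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def mixture_density(x, weights, means, covs):
    """P(x) of Equation (5): weighted sum of the component densities."""
    return sum(w * gaussian_2d(x, m, c) for w, m, c in zip(weights, means, covs))

def match_cluster(x, weights, means, covs):
    """Hard assignment: index of the component with the largest weighted density."""
    scores = [w * gaussian_2d(x, m, c) for w, m, c in zip(weights, means, covs)]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy two-component mixture; the weights sum to 1 as required by Equation (6).
weights = [0.5, 0.5]
means = [(0.0, 0.0), (5.0, 5.0)]
covs = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
print(match_cluster((0.5, 0.2), weights, means, covs))
print(match_cluster((4.8, 5.1), weights, means, covs))
```

With identity covariances the assignment reduces to picking the nearest mean, which makes the toy output easy to check by hand.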

Furthermore, each data subset is used to train an RFE to predict the room label as a multiclass classifier. The trained GMM model and all RFEs are stored for later use in the prediction phase.

The algorithm for the training and prediction phases of CEnsLoc is given in Algorithm 1.

Let the total number of samples in the training set be $N_{sample}$, the number of trees $N_{tree}$, the maximum number of splits allowed $S_{max}$, and the number of predictors for the classifier $A'$; f is the number of input predictors utilized to split at a tree node, and tc denotes the total number of classes in the dataset. For finding the best split, RFE uses the Gini index as given in the following equation, where $P_j$ is the relative frequency of class j in $N_{sample}$:

$$Gini(N_{sample}) = 1 - \sum_{j=1}^{tc} (P_j)^2 \tag{7}$$
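Equation (7) can be checked on a small example. The sketch below computes the Gini index of a list of room labels; the size-weighted score for a candidate split is the usual CART-style criterion and is an assumption here, since the text does not spell out how the two child nodes are combined.

```python
from collections import Counter

def gini(labels):
    """Gini index of a list of class labels, as in Equation (7)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_gini(left, right):
    """Size-weighted Gini of a candidate split (assumed criterion; lower is better)."""
    n = len(left) + len(right)
    return (len(left) * gini(left) + len(right) * gini(right)) / n

print(gini(["L1", "L1", "L2", "L2"]))          # 0.5: two equally frequent rooms
print(gini(["L1", "L1", "L1", "L1"]))          # 0.0: a pure node
print(split_gini(["L1", "L1"], ["L2", "L2"]))  # 0.0: the split separates the rooms
```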

RFE is trained using the method presented in Algorithm 2.

4.3. Prediction Phase. The average of the collected RSSI FPs is fed to CEnsLoc. A PCs are computed by the same method as in the training phase. The saved GMM model is invoked for cluster matching. The matched cluster's trained RFE is invoked, where the final decision is computed by the majority vote described in Algorithm 2.
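The prediction steps above can be sketched end to end as follows. This is only an outline under stated assumptions: missing values are taken to be already replaced, the stored PCA model is represented by per-AP means and a list of principal axes, and the stored GMM and RFEs are stand-in callables. All names (`predict_room`, `toy_model`, and the dictionary keys) are hypothetical.

```python
from collections import Counter

def mean_fingerprint(scans):
    """Average several RSSI vectors collected at run time."""
    return [sum(col) / len(col) for col in zip(*scans)]

def project(fp, ap_means, components):
    """Centre the fingerprint and project it onto the stored principal axes."""
    centred = [v - m for v, m in zip(fp, ap_means)]
    return [sum(w * c for w, c in zip(axis, centred)) for axis in components]

def predict_room(scans, model):
    fp = mean_fingerprint(scans)
    pcs = project(fp, model["ap_means"], model["components"])
    cluster = model["match_cluster"](pcs)  # stored GMM, hard assignment
    # Only the RFE of the matched cluster is invoked; its trees vote.
    votes = Counter(tree(pcs) for tree in model["forests"][cluster])
    return votes.most_common(1)[0][0]

# Stand-in model: one principal axis, two clusters, three "trees" per cluster.
toy_model = {
    "ap_means": [-60.0, -60.0],
    "components": [[1.0, 0.0]],
    "match_cluster": lambda pcs: 0 if pcs[0] < 0 else 1,
    "forests": {
        0: [lambda p: "L1", lambda p: "L1", lambda p: "L2"],
        1: [lambda p: "L3", lambda p: "L3", lambda p: "L3"],
    },
}
print(predict_room([[-70.0, -50.0], [-72.0, -52.0]], toy_model))
```

Because a single ensemble is consulted per query, adding more clusters does not add more tree evaluations at prediction time.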

4.4. Time Complexity of Training and Prediction Phases for CEnsLoc. Ceteris paribus, the time complexity of the training phase and the prediction phase for CEnsLoc is essentially dependent upon the size of the experimental area,

[Figure 2: Proposed localization methodology (CEnsLoc). Training phase: Wi-Fi FPs collection (RSSI_1, RSSI_2, RSSI_3, ..., RSSI_A from AP1, AP2, AP3, ..., APA, with room labels L_i), preprocessing, PCA, generation of data subsets (GMM clustering), and one RFE trained for each cluster subset. Prediction phase: run-time observed Wi-Fi FP, preprocessing, PCA, cluster matching, cluster selection, invocation of the respective pretrained RFE, and the estimated room L_x.]

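Table 1: A comparison of performance measures of various classification algorithms for Wi-Fi-based position estimation.

| Algorithm | Time to build model (sec) | Accuracy (%) | Kappa statistic | RMSE | Precision | Recall | F1 | MCC | ROC area |
|---|---|---|---|---|---|---|---|---|---|
| K* | 0 | 99.52 | 0.98 | 0.02 | 0.99 | 0.99 | 0.99 | 0.99 | 1 |
| k-NN | 0 | 99.06 | 0.99 | 0.03 | 0.99 | 0.99 | 0.99 | 0.98 | 0.99 |
| RFE | 111 | 98.76 | 0.98 | 0.04 | 0.98 | 0.98 | 0.98 | 0.98 | 1 |
| FURIA | 592 | 97.26 | 0.96 | 0.05 | 0.97 | 0.97 | 0.97 | 0.96 | 0.99 |
| Multilayer perceptron | 2584 | 97.05 | 0.96 | 0.06 | 0.97 | 0.97 | 0.97 | 0.96 | 0.99 |
| J48 | 01 | 95.91 | 0.95 | 0.08 | 0.95 | 0.95 | 0.95 | 0.95 | 0.98 |
| Dl4jMlp classifier | 2631 | 94 | 0.93 | 0.08 | 0.94 | 0.94 | 0.94 | 0.93 | 0.99 |
| SVM | 582 | 90.6 | 0.89 | 0.12 | 0.93 | 0.90 | 0.90 | 0.9 | 0.94 |
| Naive Bayes | 002 | 89.79 | 0.88 | 0.12 | 0.91 | 0.89 | 0.89 | 0.89 | 0.99 |
| AdaBoost with decision stump | 01 | 36.81 | 0.22 | 0.25 | 0.15 | 0.36 | 0.21 | 0.18 | 0.76 |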


our acquisition regimen, the resultant dataset of FPs, and how the dataset is manipulated by the tandem of schemes we employ.

4.4.1. Time Complexity of Training. In the training phase, the time complexity of PCA is governed by the following equation:

$$O(\min(A^3, N^3)) \tag{8}$$

where A = number of predictors and N = number of observations. For training a decision tree (DT) that has not been pruned, the expression is as follows:

$$O(A \times N \log(N)) \tag{9}$$

As an RFE consists of numerous DTs, merely a small number f of the total A predictors is used. The complexity of a single DT in an RFE is represented by Equation (10) and the complexity of $N_{tree}$ trees by Equation (11):

$$O(f \times N \log(N)) \tag{10}$$

$$O(N_{tree} \times f \times N \log(N)) \tag{11}$$

where $N_{tree}$ = number of trees in the RFE and f = number of random features chosen to get the best split.

When the depth of the grown trees is controlled using $S_{max}$, the training complexity of one RFE is as follows:

$$O(N_{tree} \times f \times N \times S_{max}) \tag{12}$$

Since K ensembles are grown for predicting the room level, the training time complexity is represented by the following equation:

$$O(N_{tree} \times f \times N \times S_{max} \times K) \tag{13}$$

For GMM, the complexity is expressed by the following equation:

$$O(N \times K \times D^3) \tag{14}$$

where N = number of samples, K = number of components, and D = number of dimensions.

Combining all of these, the training time complexity of CEnsLoc is as follows:

$$O(\min(A^3, N^3)) + O(N \times K \times D^3) + O(N_{tree} \times f \times N \times S_{max} \times K) \tag{15}$$
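To get a feel for which term of Equation (15) dominates, the three terms can be evaluated for the configuration reported in Section 5 (40 APs, 23 PCs, two clusters, 132 trees, 8 random features, 1024 maximum splits). This is a rough sketch only: big-O terms hide constant factors, so the numbers are relative growth indicators rather than operation counts, and N = 14061 is an assumed training-set size (about 70% of the 20087 FPs). The function name is hypothetical.

```python
def training_cost_terms(A, N, D, K, n_tree, f, s_max):
    """Evaluate the three terms of Equation (15), ignoring constant factors."""
    return {
        "pca": min(A ** 3, N ** 3),         # Equation (8)
        "gmm": N * K * D ** 3,              # Equation (14)
        "rfe": n_tree * f * N * s_max * K,  # Equation (13)
    }

terms = training_cost_terms(A=40, N=14061, D=23, K=2, n_tree=132, f=8, s_max=1024)
dominant = max(terms, key=terms.get)
print(terms)
print("dominant term:", dominant)
```

Under these assumptions the ensemble term is the largest by a wide margin, which is consistent with the number of trees and maximum splits being the main levers on training time.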

4.4.2. Time Complexity of Prediction. For prediction, the time complexity of PCA is given by the following equation:

$$O(\min(A^3, N^3)) \tag{16}$$

The complexities of a DT and of an ensemble in terms of prediction time are shown by Equations (17) and (18), respectively:

$$O(N \log(N)) \tag{17}$$

$$O(N_{tree} \times N \log(N)) \tag{18}$$

$S_{max}$ controls the depth of the trees; therefore, the complexity of an ensemble is given by the following equation:

$$O(N_{tree} \times N \times S_{max}) \tag{19}$$

Only a single RFE out of the K ensembles gets invoked for predicting a room.

The time complexity of GMM is given by the following equation:

$$O(K \times D^3) \tag{20}$$

Finally, the overall complexity of CEnsLoc in terms of prediction is as follows:

$$O(\min(A^3, N^3)) + O(K \times D^3) + O(N_{tree} \times N \times S_{max}) \tag{21}$$

5. Experimental Results and Discussion

This section details the hardware, the software used, and the particulars of the experiments that were used to evaluate the performance of CEnsLoc in terms of accuracy, precision, recall, training time, and response time.

An Intel machine (64-bit Xeon X5650) with a master clock at 2.67 GHz, 24 GB RAM, and 64-bit Windows 10 Education was used for experimentation in MATLAB. The real dataset was developed through FP collection on the ground floor of the Software Engineering (SE) Centre, University of Engineering and Technology (UET), Lahore, Pakistan. The building's dimensions are 39 m × 31 m (1209 m²), containing offices, classrooms, laboratories, and open corridors. Figures 3 and 4 depict the building's floor plan, room labels (L1–L10: closed rooms, L12: open corridor, and L11: a semiopen room), a total of 180 RPs, and the

Input: training dataset with total A predictors
       Missing value replacement MVr
       Maximum number of clusters Kmax
Output: predicted location Lx
For training
  Replace empty values with MVr
  Apply PCA on dataset to generate A′ predictors
  For k = 1 -> Kmax
    Generate clusters
    Generate and save k data subsets
    For each p ∈ k data subsets
      Train p RFE using Algorithm 2 (training)
      Calculate performance measures
    End for
  End for
  Choose optimal configuration
  Save respective models for GMM, all RFEs
For prediction at a new point x
  Replace missing values with MVr
  Apply PCA on the FP
  Match one cluster Cmatch
  Invoke RFE of Cmatch using Algorithm 2 (prediction)

ALGORITHM 1: CEnsLoc training and prediction algorithm.


number of samples collected per room. RPs are represented by small coloured dots in the rooms in Figure 3, and Figure 4 lists the total number of samples collected at each location L1–L12. Such a notion of rooms was used because walls play a crucial role in the fluctuation of Wi-Fi RSSI values [31, 32]. The area was divided into a grid of cells of 1.5 × 1.5 m². Each cell centre was marked as the reference point (RP) for FP collection.

The complete dataset consisted of 20087 Wi-Fi RSSI FPs in which a total of 40 APs were detected. Figure 4 depicts the number of FPs captured in each room/location marked as L1–L12. It must be noted that all these APs belong to the university infrastructure comprising the SE Centre and its immediate neighbouring buildings. The FPs were preprocessed following the process described in Sections 3.1 and 3.2. PCA-based dimension reduction was employed, with optimal results found with 23 PCs. The resultant dataset was divided in a 70:30 stratified ratio into training and testing subsets. The experimental results are discussed using both 10-fold cross validation (10-CV) on the training subset and the unseen 30% test subset. GMM clustering then partitioned the training data into further subsets; the optimal configuration was two clusters, with shared covariance set to true and diagonal covariance. Furthermore, one RFE was trained for each subset with 132 trees, 1024 maximum splits, and 8 random features. CEnsLoc was hosted on a machine, and the mean of the observed RSSI FPs was used for run-time location estimation after applying the same PCA model and GMM cluster matching finalized in the training phase. The best matched cluster's respective RFE was invoked for room prediction. Various companion apps can query the location of subscribed users using the IPS configured as a server. Response time was computed as the average of the difference between localization query time and prediction generation time. The following formulae were used to compute the performance parameters:

$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{precision} = \frac{TP}{TP + FP}, \qquad \text{recall} = \frac{TP}{TP + FN} \tag{22}$$
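A small worked example of Equation (22) for a multiclass room classifier is sketched below. It treats one room as the positive class and all other rooms as negative (one-vs-rest); how the per-room values are aggregated into the reported figures is not stated in the text, so this counting scheme and the function names are assumptions.

```python
def one_vs_rest_counts(y_true, y_pred, room):
    """Count TP, TN, FP, FN with `room` treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == room and p == room for t, p in pairs)
    tn = sum(t != room and p != room for t, p in pairs)
    fp = sum(t != room and p == room for t, p in pairs)
    fn = sum(t == room and p != room for t, p in pairs)
    return tp, tn, fp, fn

def scores(tp, tn, fp, fn):
    """Accuracy, precision, and recall as defined in Equation (22)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

y_true = ["L1", "L1", "L2", "L2", "L3"]
y_pred = ["L1", "L2", "L2", "L2", "L3"]
counts = one_vs_rest_counts(y_true, y_pred, "L2")
print(counts)           # (2, 2, 1, 0)
print(scores(*counts))  # accuracy 0.8, precision 2/3, recall 1.0
```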

5.1. Classification Effectiveness and Efficiency. Tables 2 and 3 summarize the 10-CV performance evaluation of CEnsLoc, comparing it with k-NN [33], artificial neural network (ANN) [34], K* [35], FURIA [36], and DeepLearning4J [36]. The best results are highlighted throughout in boldface. The k-NN results were averaged over six different configurations based on the number of neighbors, the similarity measure employed, and the neighbor weightage. Similarly, six different configurations of ANN, namely, 2-, 3-, and 4-layer ANNs, each having a varying number of neurons per layer and employing two learning algorithms, SCG and RBP, were averaged to obtain the results. For K*, the entropic blend percentage was varied from 10 to 90 percent. The results for FURIA were obtained by varying the count of folds for growth and pruning as well as the minimum instance weight for each split. The DeepLearning4J results were obtained by varying the number of neurons per layer, the number of dense layers in the network, and the training algorithm, out of many possible variations in the hyperparameters. The results for CEnsLoc were generated with the optimal configuration found for our proposed approach. For all the approaches, the same set of configurations used during the 10-CV performance evaluation was used for result generation on the 30% unseen stratified test data subset, whose results are presented in Table 4. The best performance on both the 10-CV results (Table 2) and the test dataset (Table 4)

Input: data subset with total A′ predictors for training
       No. of trees Ntree
       Allowed maximum no. of splits Smax
       Random no. of predictors/features f
Output: estimated location Lx
For system training
  Step 1: for l = 1 to Ntree
    (i) Choose a bootstrap sample set (SS) of size Nsample with replacement from the training data subset
    (ii) Grow a Random Forest tree (Tl) on SS by recursively iterating (a)-(c) for every terminal node of the tree until the maximum no. of splits (Smax) is reached
      (a) Randomly select f features/variables from the A′ predictors (f << A′)
      (b) Choose the best feature/split-point from the f employing the Gini index
      (c) Split the node into two children nodes
  Step 2: Produce the resulting ensemble of trees {Tl}, l = 1, ..., Ntree
For location prediction at a new point x from the RFE
  Let Lm(x) be the room/class prediction by the mth RFE tree
  Lrf(x) = majority vote {Lm(x)}, m = 1, ..., Ntree

ALGORITHM 2: RFE classification algorithm for training and prediction.


was obtained by CEnsLoc with 97% and 95% accuracy, followed by FURIA (92%, 90%), k-NN (91%, 89%), K* (90%, 87%), ANN (85%, 82%), and DeepLearning4J (73%, 71%), respectively. The results are presented on both 10-CV and the unseen test data subset to show that, for a small dataset, 10-CV results can provide a good performance estimate, as the results on the test data subset showed trends similar to those of the 10-CV results. Although deep learning and ANN provide good performance results, finding the optimal configuration for a huge number of tunable parameters is a tricky task. Moreover, during the experiments, it was found that, despite generic guidelines for parameter tuning, the performance measures and convergence rate of ANN as well as deep learning schemes fluctuate highly with even a slight variation in the number of neurons in the layers, the training algorithm, and the number of hidden layers, which makes it harder to find the optimal configuration for ANN and deep learning models. Lazy and instance-based approaches such as k-NN and K* are also good candidates for indoor localization; however, they have a very limited number of tunable parameters, resulting in an inability to surpass CEnsLoc performance despite trying different parameter combinations. They do not generalize as well as RFE, as the end result of prediction is heavily dependent on the majority vote of the k closest matched samples in the dataset. They also need template matching against the entire dataset for one location prediction, so the response time grows with the number of samples in the dataset. Such growth is highly likely in practical real-world scenarios, as a typical building has quite a large number of visible APs and a sufficiently large number of samples is also required for classifiers to work properly.

A minimum response time of 2.05E−05 seconds was obtained by FURIA. DeepLearning4J stood second with a response time of 6.82E−05 seconds. ANN, k-NN, and CEnsLoc all had response times on the scale of E−04 seconds, which is 10 times slower than the aforementioned two

[Figure 3: Room labels (L1–L12) and marked RPs.]

[Figure 4: Number of samples per room (bar chart; x-axis: locations L1–L12; y-axis: number of samples, 0 to 5000). Bar values in the order extracted: 446, 446, 717, 2700, 2700, 2700, 2700, 1446, 250, 743, 743, 4496.]


approaches, but the difference was trifling and cannot be detected by any human user of the system; the accuracy, precision, and recall provided by CEnsLoc were much greater than those of all other approaches.

FURIA, an upgraded version of the RIPPER algorithm, stands out as the second-best performer for indoor localization, indicating that rule-based algorithms, especially fuzzified versions with good generalization capabilities, perform well for location estimation too. However, it lagged behind CEnsLoc in terms of accuracy, precision, and recall by 5%, 10%, and 15%, respectively.

The details of both the 10-CV and test dataset results for all compared IPS and CEnsLoc are depicted in Figure 5 for side-by-side visual comparison.

5.2. Out-of-Bag (OOB) Error Results. There is a performance measure called OOB error, which is peculiar to RFE. During training of an RFE, data subsets are generated with replacement, resulting in some repeated and some left-out observations (OOB observations) for each tree. That particular tree does not train on these left-out OOB observations. The prediction capability of an RFE can be measured using the "OOB loss," which is the error made on the unseen OOB observations during training. The OOB loss measure has been investigated and shown to provide an upper bound on the testing error [37], which is specifically useful for small-sized datasets. Hence, OOB error can be used just like, or instead of, an unseen test data subset if the available dataset is small or an unseen test dataset is unavailable. It provides a very good estimate of the trained classifier's generalization capability. Table 5 summarizes the OOB loss compared with the averaged 10-CV loss, indicating that it indeed bounds it.
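How OOB observations arise can be seen with a short bootstrap sketch. It draws one bootstrap sample of the same size as the training subset and lists the indices that were never drawn; those are the OOB observations for the tree grown on that sample. The function name, the seed, and the sample size are illustrative assumptions.

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; return (in_bag, oob)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    oob = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, oob

rng = random.Random(0)  # fixed seed so the sketch is repeatable
in_bag, oob = bootstrap_split(10000, rng)
# For large samples, roughly 1/e (about 37%) of the observations are left out.
print(len(oob) / 10000)
```

Each tree therefore has its own held-out set, and the OOB loss is obtained by scoring every observation only with the trees that did not see it during training.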

6 Conclusion and Future Work

Location predictionestimation provides derivation ofmeaningful context for a broad range of services and ap-plications Indoor localization can open altogether a pleth-ora of new opportunities because humans spend most oftheir time indoors CEnsLoc offers shorter response timeand an overall improvement in accuracy precision andrecall With only a few parameters to be tuned it is suited forFP-based localization which requires frequent recollection ofdata and retraining CEnsLoc was able to attain 97 ac-curacy in comparison with other IPS averaged over 6 dif-ferent configurations namely FURIA k-NN Klowast ANN andDeepLearning4J with 92 91 90 85 and 73 ac-curacies respectively It can be utilized for elderly assistancenavigation smart buildings and smart transportation toname a few potential applications

Our future work includes deployment of CEnsLoc across a wide range of civil infrastructures, including different floors and/or buildings, to understand its performance in more detail, as well as its scalability using crowdsourcing. We also aim to build safety, security, and evacuation guide applications with CEnsLoc at their core for users in offices, universities, and retail. Furthermore, integration with GPS, Bluetooth, and PDR to further enhance accuracy, utilizing available hybrid technologies at a given time, is also part of the plan.

Table 2: Tenfold cross-validated performance evaluation and comparison of CEnsLoc.

| Method | Accuracy | Precision | Recall |
|---|---|---|---|
| k-NN | 0.91 | 0.88 | 0.87 |
| ANN | 0.85 | 0.83 | 0.78 |
| K* | 0.90 | 0.96 | 0.62 |
| FURIA | 0.92 | 0.89 | 0.82 |
| DeepLearning4J | 0.73 | 0.51 | 0.41 |
| CEnsLoc | 0.97 | 0.98 | 0.97 |

Table 3: Tenfold cross-validated performance comparison of CEnsLoc in terms of training and response time (seconds).

| Time (sec) | Training time (10-fold) | Avg. training time (1-fold) | Response time (dataset) | Response time (1 sample) |
|---|---|---|---|---|
| k-NN | - | - | 1.97 | 1.70E−04 |
| ANN | 267.20 | 26.72 | 0.18 | 1.41E−04 |
| K* | - | - | 103.09 | 8.17E−02 |
| FURIA | 15.87 | 1.58 | 0.03 | 2.05E−05 |
| DeepLearning4J | 755.59 | 75.5 | 0.09 | 6.82E−05 |
| CEnsLoc | 140.07 | 14.007 | 2.35 | 2.08E−04 |

Table 4: Performance evaluation and comparison of CEnsLoc on the test subset.

| Method | Accuracy | Precision | Recall |
|---|---|---|---|
| k-NN | 0.89 | 0.87 | 0.85 |
| ANN | 0.82 | 0.81 | 0.76 |
| K* | 0.87 | 0.94 | 0.60 |
| FURIA | 0.90 | 0.86 | 0.79 |
| DeepLearning4J | 0.71 | 0.48 | 0.39 |
| CEnsLoc | 0.95 | 0.96 | 0.94 |

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07048697).

References

[1] J. Torres-Sospedra, R. Montoliu, S. Trilles, O. Belmonte, and J. Huerta, "Comprehensive analysis of distance and similarity measures for Wi-Fi fingerprinting indoor positioning systems," Expert Systems with Applications, vol. 42, no. 23, pp. 9263–9278, 2015.

[2] H. Mehmood and N. K. Tripathi, "Cascading artificial neural networks optimized by genetic algorithms and integrated with global navigation satellite system to offer accurate ubiquitous positioning in urban environment," Computers, Environment and Urban Systems, vol. 37, no. 1, pp. 35–44, 2013.

[3] M. M. Soltani, A. Motamedi, and A. Hammad, "Enhancing cluster-based RFID tag localization using artificial neural networks and virtual reference tags," Automation in Construction, vol. 54, pp. 93–105, 2015.

[4] M. L. Rodrigues, L. F. M. Vieira, and M. F. M. Campos, "Fingerprinting-based radio localization in indoor environments using multiple wireless technologies," in Proceedings of IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, pp. 1203–1207, Toronto, ON, Canada, September 2011.

[5] J. Luo and H. Gao, "Deep belief networks for fingerprinting indoor localization using ultrawideband technology," International Journal of Distributed Sensor Networks, vol. 12, no. 1, article 5840916, 2016.

[6] S. Saeedi, L. Paull, M. Trentini, and H. Li, "Neural network-based multiple robot simultaneous localization and mapping," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 880–885, 2011.

[7] M. Azizyan, R. R. Choudhury, and I. Constandache, "SurroundSense: mobile phone localization via ambience fingerprinting," in Proceedings of the 15th Annual International Conference on Mobile Computing and Networking-MobiCom'09, pp. 261–272, Beijing, China, 2009.

[8] L. Pei, R. Chen, J. Liu et al., "Motion recognition assisted indoor wireless navigation on a mobile phone," in Proceedings of the 23rd International Technical Meeting of The Satellite Division of the Institute of Navigation, pp. 3366–3375, Portland, OR, USA, September 2010.

[9] P. S. Nagpal and R. Rashidzadeh, "Indoor positioning using magnetic compass and accelerometer of smartphones," in Proceedings of International Conference on Selected Topics in Mobile and Wireless Networking (MoWNeT), pp. 140–145, Montreal, Canada, 2013.

[10] J. Menke and A. Zakhor, "Multi-modal indoor positioning of mobile devices," in Proceedings of IEEE International Conference on Indoor Positioning and Indoor Navigation, pp. 13–16, Alberta, Canada, October 2015.

[11] M. Oussalah, M. Alakhras, and M. I. Hussein, "Multivariable fuzzy inference system for fingerprinting indoor localization," Fuzzy Sets and Systems, vol. 269, pp. 65–89, 2015.

[12] N. Li, J. Chen, Y. Yuan, X. Tian, Y. Han, and M. Xia, "A wi-fi indoor localization strategy using particle swarm optimization based artificial neural networks," International Journal of Distributed Sensor Networks, vol. 12, no. 3, article 4583147, 2016.

Figure 5: Performance measures on both 10-CV and test dataset. (Two panels, 10-CV and test dataset, each showing precision, recall, and accuracy on a 0 to 1.2 axis for k-NN, 4-layer ANN (SCG), 4-layer ANN (RBP), 3-layer ANN (SCG), 3-layer ANN (RBP), 2-layer ANN (SCG), 2-layer ANN (RBP), K*, FURIA, DeepLearning4J, and CEnsLoc.)

Table 5: OOB loss.

| Average 10-fold loss | OOB loss |
|---|---|
| 0.0292813 | 0.0300123 |


[13] C. Song, J. Wang, and G. Yuan, "Hidden naive bayes indoor fingerprinting localization based on best-discriminating AP selection," ISPRS International Journal of Geo-Information, vol. 5, no. 10, p. 189, 2016.

[14] P. Bahl and V. N. Padmanabhan, "RADAR: an in-building RF-based user location and tracking system," in Proceedings of IEEE INFOCOM 2000, Conference on Computer Communications, Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies (Cat. No. 00CH37064), vol. 2, pp. 775–784, Tel Aviv, Israel, March 2000.

[15] M. Cooper, J. Biehl, G. Filby, and S. Kratz, "LoCo: boosting for indoor location classification combining Wi-Fi and BLE," Personal and Ubiquitous Computing, vol. 20, no. 1, pp. 1–14, 2016.

[16] C. Wang, F. Wu, Z. Shi, and D. Zhang, "Indoor positioning technique by combining RFID and particle swarm optimization-based back propagation neural network," Optik (Stuttg), vol. 127, no. 17, pp. 6839–6849, 2016.

[17] J. Xu, H. Dai, and W. Ying, "Multi-layer neural network for received signal strength-based indoor localisation," IET Communications, vol. 10, no. 6, pp. 717–723, 2016.

[18] W. Zhang, K. Liu, W. Zhang, Y. Zhang, and J. Gu, "Deep neural networks for wireless localization in indoor and outdoor environments," Neurocomputing, vol. 194, pp. 279–287, 2016.

[19] L. Calderoni, M. Ferrara, A. Franco, and D. Maio, "Indoor localization in a hospital environment using Random Forest classifiers," Expert Systems with Applications, vol. 42, no. 1, pp. 125–134, 2015.

[20] E. Jedari, Z. Wu, R. Rashidzadeh, and M. Saif, "Wi-Fi based indoor location positioning employing random forest classifier," in Proceedings of 2015 International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 13–16, Banff, Alberta, Canada, October 2015.

[21] Y. Mo, Z. Zhang, Y. Lu, W. Meng, and G. Agha, "Random forest based coarse locating and KPCA feature extraction for indoor positioning system," Mathematical Problems in Engineering, vol. 2014, Article ID 850926, 8 pages, 2014.

[22] R. Gorak and M. Luckner, "Malfunction immune Wi–Fi localisation method," in Computational Collective Intelligence, pp. 328–337, Springer, Cham, Switzerland, 2015.

[23] R. Gorak and M. Luckner, "Modified random forest algorithm for Wi–Fi indoor localization system," in Proceedings of International Conference on Computational Collective Intelligence, pp. 147–157, Cham, Switzerland, September 2016.

[24] J. Niu, B. Wang, L. Cheng, and J. J. P. C. Rodrigues, "WicLoc: an indoor localization system based on WiFi fingerprints and crowdsourcing," in Proceedings of 2015 IEEE International Conference on Communications (ICC), pp. 3008–3013, London, UK, June 2015.

[25] G. Ding, Z. Tan, J. Zhang, and L. Zhang, "Fingerprinting localization based on affinity propagation clustering and artificial neural networks," in Proceedings of IEEE Wireless Communications and Networking Conference (WCNC), pp. 2317–2322, Shanghai, China, April 2013.

[26] O. Leon, J. Hernandez-Serrano, and M. Soriano, "Securing cognitive radio networks," International Journal of Communication Systems, vol. 23, no. 5, pp. 633–652, 2010.

[27] S. Tuncer and T. Tuncer, "Indoor localization with bluetooth technology using artificial neural networks," in Proceedings of the IEEE 19th International Conference on Intelligent Engineering Systems, pp. 213–217, Bratislava, Slovakia, September 2015.

[28] B. A. Akram, A. H. Akbar, B. Wajid, O. Shafiq, and A. Zafar, "LocSwayamwar: finding a suitable ML algorithm for wi-fi fingerprinting based indoor positioning system," in Lecture Notes in Electrical Engineering, A. Boyaci, A. Ekti, M. Aydin, and S. Yarkan, Eds., vol. 504, Springer, Singapore, 2018.

[29] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.

[30] K. Kaji and N. Kawaguchi, "Design and implementation of wifi indoor localization based on Gaussian mixture model and particle filter," in Proceedings of 2012 International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 1–9, Sydney, Australia, November 2012.

[31] C. Wu, Z. Yang, Y. Liu, and W. Xi, "WILL: wireless indoor localization without site survey," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 4, pp. 839–848, 2013.

[32] S. Yang, P. Dessai, M. Verma, and M. Gerla, "FreeLoc: calibration-free crowdsourced indoor localization," in Proceedings of IEEE INFOCOM, pp. 2481–2489, Turin, Italy, April 2013.

[33] T. Seidl and H.-P. Kriegel, "Optimal multi-step k-nearest neighbor search," ACM SIGMOD Record, vol. 27, no. 2, pp. 154–165, 1998.

[34] W. S. Mcculloch and W. Pitts, "A logical calculus nervous activity," Bulletin of Mathematical Biology, vol. 52, no. 1-2, pp. 99–115, 1990.

[35] J. G. Cleary and L. E. Trigg, "K*: an instance-based learner using an entropic distance measure," in Proceedings of Twelfth International Conference on Machine Learning, vol. 5, pp. 1–14, Tahoe City, CA, USA, July 1995.

[36] J. Huhn and E. Hullermeier, "FURIA: an algorithm for unordered fuzzy rule induction," Data Mining and Knowledge Discovery, vol. 19, no. 3, pp. 293–319, 2009.

[37] S. Ciss, Generalization Error and Out-of-bag Bounds in Random (Uniform) Forests, Ph.D. thesis, French University, Paris, France, 2015.


3.3. Classifier Evaluation. We evaluated the performance of 60+ classifiers in WEKA for room-level prediction on a sample dataset, out of which the ten best-performing classifiers are summarized in Table 1.

Taking into account all the performance measures, from accuracy to receiver operating characteristics (ROC) area, the best overall performance (in descending order) was attained by K*, k-nearest neighbors (k-NN), Random Forest Ensemble (RFE), Fuzzy Unordered Rule Induction Algorithm (FURIA), multilayer perceptron, deep learning for Java (Dl4jMlpClassifier), support vector machines, the Naive Bayes classifier, and finally AdaBoost, which uses stumps for decision-making. K* and k-NN are both instance-based classifiers and produced similar performance results, followed by RFE, FURIA, and multilayer perceptron (ANN). FURIA and ANN show an almost identical performance trend. We selected k-NN and ANN for comparison because many existing works have utilized them, which makes comparison with other works easier. Moreover, we chose the K*, FURIA, and DeepLearning4J classifiers for comparison too, as they are state-of-the-art and relatively new machine learning methods. RFE is suited to large datasets, giving high accuracy and time efficiency. It is resilient to noise in data and is also capable of dealing with missing values. RFE utilizes bootstrapping, which reduces variance and keeps the bias in check, because creating different subsets of the training dataset by sampling with replacement ensures that the trees have little or no correlation. Therefore, overfitting is avoided, making it more generalizable. Both the training and prediction time of RFE, due to the parallel computation supported by bagging, make it suited for real-time implementation of IPS. Hence, we selected RFE [29] as the suitable classifier module in our proposed methodology.

4. Proposed Localization Methodology

4.1. Problem Formulation. We assume localization/positioning to be a combination of a clustering and a multiclass classification problem, where each room is considered to be a class. A two-dimensional indoor area is partitioned into $R$ square grids of dimensions $C \times D$ m². The centre of each square grid is a reference point (RP). A device equipped with a wireless adapter card can sense wireless signals from a total of $A$ APs at a certain RP at a given time, which forms the fingerprint $FP_i = \{RV_i, L_i\}$, $RV_i = \{rssi_{i1}, rssi_{i2}, rssi_{i3}, \ldots, rssi_{iA}\}$, where $rssi_{ij}$ symbolizes the RSSI value from the $j$th AP (dBm) in the $i$th sample of FP collected and $L_i$ is the respective class/room label. Let $N$ such FPs constitute the dataset. The localization function $LF$ is learnt from the FP dataset to map the observed FP to a certain room label $L_x$, as described by the following equation:

$$L_x = LF(FP_i) \tag{1}$$
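The fingerprint definition above maps naturally onto a small record type. The following is a minimal sketch only: the names `Fingerprint`, `LocalizationFunction`, and `num_missing` are hypothetical, and representing an unheard AP as `None` is an assumption made for illustration rather than the authors' storage format.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Fingerprint:
    """One FP_i: an RSSI vector RV_i over A access points plus a room label L_i."""
    rssi: List[Optional[float]]  # dBm per AP; None marks an AP that was not heard
    label: Optional[str] = None  # room label such as "L3"; None at prediction time


# LF maps an observed fingerprint to a room label, as in Equation (1).
LocalizationFunction = Callable[[Fingerprint], str]


def num_missing(fp: Fingerprint) -> int:
    """Count APs with no reading, which is why the raw dataset is sparse."""
    return sum(1 for value in fp.rssi if value is None)


fp = Fingerprint(rssi=[-48.0, None, -71.5, None], label="L3")
print(num_missing(fp))
```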

There are two phases of the proposed methodology (CEnsLoc), namely, the training phase and the prediction phase, as shown in Figure 2. First of all, a sparse Wi-Fi FP dataset with many missing values is collected. In the training phase, the collected FPs reserved for training the system are preprocessed for missing value replacement, followed by PCA application for dimension reduction. Then, GMM-based hard clustering is used for nonoverlapping/disjoint data subset generation. For each such subset, a separate RFE is trained for location prediction and stored in the database. In the location prediction phase, the same steps of missing value replacement and PCA computation are performed on the captured FP. Then, the FP is matched with a single cluster using the stored GMM. The final prediction $L_x$ is generated by invoking the respective pretrained RFE for the best matched cluster/subset. These phases are formally elaborated in detail in Sections 4.2 and 4.3, respectively.

4.2. Training Phase. During training, the training dataset is fed to the preprocessing module, which replaces empty readings with the missing value replacement (MVr). PCA is then performed on the dataset for dimension reduction. PCA applies an orthogonal transformation to remove redundant information and decrease the number of predictors. Applying PCA yields the principal components (PCs), a set of linearly uncorrelated variables such that the maximum variance under some projection of the data is captured by the first principal component, and so on. Choosing the smaller of the number of predictors/APs and the number of samples minus one, $A'$ PCs are generated: $PC_1, PC_2, \ldots, PC_{A'}$. For computation of the PCs, first, the mean RSSI value of each AP is subtracted from the $i$th RP using the following equation:

$$X_i = \frac{1}{N}\sum^{N} FP_i - \frac{1}{R}\left(\frac{1}{N}\sum^{R}\sum^{N} FP_i\right) \tag{2}$$

where $N$ is the total number of rows of samples/dataset and $R$ is the total number of RPs. The PCs matrix is computed by the following equation:

$$PC_i = X_i \times E_A \tag{3}$$

where $E_A$ is the eigenvector matrix of the average RSSI value of each AP in the $i$th RP. The resulting dataset is then divided into subsets using GMM clustering ($K$ data subsets).

Figure 1: User orientation during data collection (orientations N, NE, E, SE, S, SW, W, and NW; access points AP1, AP2, AP3, ..., APA).

Equation (4) is a distribution based on a 2D Gaussian, where the mean is represented by $\mu$ and the covariance matrix by $\Sigma$. A GMM with $N$ as the number of overlapping distributions is described by the following equations:

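$$\mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{2\pi\sqrt{|\Sigma|}} \exp\left(-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right) \tag{4}$$

$$P(x) = \sum_{k=1}^{N} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k) \tag{5}$$

$$\sum_{k=1}^{N} \pi_k = 1 \tag{6}$$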

where $\pi_k$ defines the mixing coefficient to express the weight of each mixing element (weighted sum being 1). The resultant shape of the 2D Gaussian is the average of the distributions individually in terms of covariance and mixing coefficients. Assuming that a linear mix of weighted coefficients for each of the respective distribution's average and covariance is obtained, and by incorporating sufficient distributions, a final density function may be obtained. The reason behind GMM clustering was the similarity between the Gaussian distribution and the radio propagation characteristics of a Wi-Fi AP [30], which makes GMM a highly suitable candidate for clustering Wi-Fi RSSI vectors.
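To make Equations (4) to (6) concrete, the sketch below evaluates the 2D Gaussian density, the mixture density, and a hard cluster assignment for a single point. It is an illustration under stated assumptions: the function names are hypothetical, the component parameters are made-up values rather than fitted ones, and the two-dimensional case follows the form of Equation (4) even though the clustering in this work operates on the PCA-reduced RSSI vectors.

```python
import math


def gaussian_2d(x, mean, cov):
    """Density of Equation (4) for a 2D point x with 2x2 covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - mean[0], x[1] - mean[1])
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu)
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))


def mixture_density(x, weights, means, covs):
    """Equation (5): weighted sum of the component densities."""
    return sum(w * gaussian_2d(x, m, c) for w, m, c in zip(weights, means, covs))


def hard_assign(x, weights, means, covs):
    """Hard clustering: index of the component with the largest weighted density."""
    scores = [w * gaussian_2d(x, m, c) for w, m, c in zip(weights, means, covs)]
    return scores.index(max(scores))


# Two illustrative components sharing one diagonal covariance (made-up values).
weights = [0.6, 0.4]  # mixing coefficients, summing to 1 as in Equation (6)
means = [(0.0, 0.0), (4.0, 4.0)]
shared_cov = ((1.0, 0.0), (0.0, 1.0))
covs = [shared_cov, shared_cov]

print(hard_assign((0.5, -0.2), weights, means, covs))
print(hard_assign((3.6, 4.1), weights, means, covs))
```

Hard assignment of this kind yields the disjoint data subsets on which the individual RFEs are trained.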

Furthermore, each data subset is used for training an RFE to predict the room label as a multiclass classifier. The trained models of the GMM as well as all RFEs are stored for later use in the prediction phase.

The algorithm for the training and prediction phases of CEnsLoc is given in Algorithm 1.

Let the total number of samples in the training set be $N_{sample}$, the number of trees $N_{tree}$, the maximum number of splits allowed $S_{max}$, and the number of predictors for the classifier $A'$; let $f$ be the number of input predictors that are utilized to split at a tree node and $tc$ the total number of classes in the dataset. For finding the best split, RFE uses the Gini Index as given in the following equation, where $P_j$ is the relative frequency of class $j$ in $N_{sample}$:

$$\mathrm{Gini}(N_{sample}) = 1 - \sum_{j=1}^{tc} (P_j)^2 \tag{7}$$
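Equation (7) can be checked on small label lists. The function name `gini_index` below is hypothetical; the sketch simply computes one minus the sum of squared class relative frequencies.

```python
from collections import Counter


def gini_index(labels):
    """Equation (7): 1 minus the sum of squared class relative frequencies."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


print(gini_index(["L1", "L1", "L1", "L1"]))  # pure node: 0.0
print(gini_index(["L1", "L1", "L2", "L2"]))  # evenly mixed two-class node: 0.5
```

A lower value indicates a purer node, so a candidate split is preferred when it lowers the impurity of the resulting children.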

RFE is trained using the method presented in Algorithm 2.

4.3. Prediction Phase. The average of the collected RSSI FPs is fed to CEnsLoc. $A'$ PCs are computed by the same method as in the training phase. The saved GMM model is invoked for cluster matching. The matched cluster's trained RFE is invoked, where the final decision is computed by the majority vote described in Algorithm 2.
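The majority vote that produces the final room label can be sketched as follows. The function name `majority_vote` and the tie-breaking rule are assumptions made for this illustration; the source does not state how ties are resolved.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the room label predicted by the most trees.

    Ties are broken in favour of the label that appears first, which is an
    arbitrary choice made for this sketch.
    """
    counts = Counter(tree_predictions)
    best = max(counts.values())
    for label in tree_predictions:
        if counts[label] == best:
            return label


votes = ["L3", "L4", "L3", "L12", "L3", "L4"]
print(majority_vote(votes))  # L3
```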

4.4. Time Complexity of Training and Prediction Phases for CEnsLoc. Ceteris paribus, the time complexity of the training phase and the prediction phase for CEnsLoc is essentially dependent upon the size of the experimental area, our acquisition regimen, the resultant dataset of FPs, and how the dataset is manipulated by the tandem of schemes we employ.

Figure 2: Proposed localization methodology (CEnsLoc). Training phase: Wi-Fi FPs collection (Wi-Fi RSSI FPs with room labels), preprocessing, PCA, generation of data subsets (GMM clustering), and one RFE trained for each cluster subset. Prediction phase: run-time observed Wi-Fi FP, preprocessing, PCA, cluster matching, cluster selection, and invocation of the respective pretrained RFE to obtain the estimated room $L_x$.

Table 1: A comparison of performance measures of various classification algorithms for Wi-Fi-based position estimation.

| Algorithm | Time to build model (sec) | Accuracy (%) | Kappa statistic | RMSE | Precision | Recall | F1 | MCC | ROC area |
|---|---|---|---|---|---|---|---|---|---|
| K* | 0 | 99.52 | 0.98 | 0.02 | 0.99 | 0.99 | 0.99 | 0.99 | 1 |
| k-NN | 0 | 99.06 | 0.99 | 0.03 | 0.99 | 0.99 | 0.99 | 0.98 | 0.99 |
| RFE | 1.11 | 98.76 | 0.98 | 0.04 | 0.98 | 0.98 | 0.98 | 0.98 | 1 |
| FURIA | 5.92 | 97.26 | 0.96 | 0.05 | 0.97 | 0.97 | 0.97 | 0.96 | 0.99 |
| Multilayer perceptron | 25.84 | 97.05 | 0.96 | 0.06 | 0.97 | 0.97 | 0.97 | 0.96 | 0.99 |
| J48 | 0.1 | 95.91 | 0.95 | 0.08 | 0.95 | 0.95 | 0.95 | 0.95 | 0.98 |
| Dl4jMlp classifier | 26.31 | 94 | 0.93 | 0.08 | 0.94 | 0.94 | 0.94 | 0.93 | 0.99 |
| SVM | 5.82 | 90.6 | 0.89 | 0.12 | 0.93 | 0.90 | 0.90 | 0.9 | 0.94 |
| Naive Bayes | 0.02 | 89.79 | 0.88 | 0.12 | 0.91 | 0.89 | 0.89 | 0.89 | 0.99 |
| AdaBoost with decision stump | 0.1 | 36.81 | 0.22 | 0.25 | 0.15 | 0.36 | 0.21 | 0.18 | 0.76 |

4.4.1. Time Complexity of Training. In the training phase, for PCA, the time complexity is governed by the following equation:

$$O\left(\min\left(A^3, N^3\right)\right) \tag{8}$$

where $A$ = number of predictors and $N$ = number of observations.

For training a decision tree (DT) that has not been pruned, the expression is as follows:

$$O(A \times N \log(N)) \tag{9}$$

As RFE consists of numerous DTs, merely a small number $f$ is used from the total predictors $A$. The complexity for a single DT in RFE is represented by Equation (10) and the complexity of $N_{tree}$ trees by Equation (11):

$$O(f \times N \log(N)) \tag{10}$$

$$O(N_{tree} \times f \times N \log(N)) \tag{11}$$

where $N_{tree}$ = number of trees in RFE and $f$ = number of random features that are chosen to get the best split.

While trying to control the depth of the grown trees using $S_{max}$, the training complexity of one RFE is as follows:

$$O(N_{tree} \times f \times N \times S_{max}) \tag{12}$$

Since $K$ ensembles are grown for predicting room level, the time-to-train complexity is represented by the following equation:

$$O(N_{tree} \times f \times N \times S_{max} \times K) \tag{13}$$

For GMM, the complexity is expressed by the following equation:

$$O\left(N \times K \times D^3\right) \tag{14}$$

where $N$ = number of samples, $K$ = number of components, and $D$ = number of dimensions.

Incorporating all, the time complexity to train CEnsLoc is as follows:

$$O\left(\min\left(A^3, N^3\right)\right) + O\left(N \times K \times D^3\right) + O(N_{tree} \times f \times N \times S_{max} \times K) \tag{15}$$

4.4.2. Time Complexity of Prediction. For prediction, the time complexity for PCA is given by the following equation:

$$O\left(\min\left(A^3, N^3\right)\right) \tag{16}$$

The complexities for a DT and an ensemble in terms of prediction time are shown by Equations (17) and (18), respectively:

$$O(N \log(N)) \tag{17}$$

$$O(N_{tree} \times N \log(N)) \tag{18}$$

$S_{max}$ controls the depth of the trees; therefore, the complexity for an ensemble is given by the following equation:

$$O(N_{tree} \times N \times S_{max}) \tag{19}$$

Only a single RFE out of $K$ ensembles gets invoked for predicting a room.

The time complexity for GMM cluster matching is given by the following equation:

$$O\left(K \times D^3\right) \tag{20}$$

Finally, the overall complexity for CEnsLoc in terms of prediction is as follows:

$$O\left(\min\left(A^3, N^3\right)\right) + O\left(K \times D^3\right) + O(N_{tree} \times N \times S_{max}) \tag{21}$$
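To give a feel for the relative size of the terms in Equations (15) and (21), the sketch below evaluates them for the configuration reported in Section 5 (40 APs, 23 PCs, two clusters, 132 trees, 8 random features, 1024 maximum splits). This is only an order-of-magnitude illustration: big-O expressions hide constant factors, the function names are hypothetical, and taking $D$ as the 23 PCs and $N$ as roughly 70% of the 20087 fingerprints are assumptions made here.

```python
def training_terms(a, n, k, d, n_tree, f, s_max):
    """Term-by-term evaluation of Equation (15); constants are ignored."""
    return {
        "pca": min(a ** 3, n ** 3),
        "gmm": n * k * d ** 3,
        "rfe": n_tree * f * n * s_max * k,
    }


def prediction_terms(a, n, k, d, n_tree, s_max):
    """Term-by-term evaluation of Equation (21); constants are ignored."""
    return {
        "pca": min(a ** 3, n ** 3),
        "gmm": k * d ** 3,
        "rfe": n_tree * n * s_max,
    }


# Values reported in Section 5; using the 23 PCs as D and about 70% of the
# 20087 fingerprints as N is an assumption made for this illustration.
n_train = round(0.7 * 20087)
train = training_terms(a=40, n=n_train, k=2, d=23, n_tree=132, f=8, s_max=1024)
predict = prediction_terms(a=40, n=1, k=2, d=23, n_tree=132, s_max=1024)

print(train)
print(predict)
```

Under these assumptions the RFE term dominates training, while prediction for a single fingerprint involves far smaller counts, which is in line with the short response times reported in Table 3.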

5. Experimental Results and Discussion

This section details the hardware equipment, the software used, and the particulars of the experiments that were used to evaluate the performance of CEnsLoc in light of accuracy, precision, recall, training time, and response time.

An Intel machine (64-bit Xeon X5650) with a master clock at 2.67 GHz, 24 GB RAM, and 64-bit Windows 10 Education was used for experimentation in MATLAB. The real dataset was developed through FP collection at the ground floor of the Software Engineering (SE) Centre, University of Engineering and Technology (UET), Lahore, Pakistan. The building's dimensions are 39 m × 31 m (1209 m²), containing offices, classrooms, laboratories, and open corridors. Figures 3 and 4 depict the building's floor plan, room labels (L1–L10: closed rooms; L12: open corridor; and L11: a semiopen room), a total of 180 RPs, and the number of samples collected per room. RPs are represented by small coloured dots in the rooms in Figure 3, and Figure 4 lists the total number of samples collected at each location L1–L12. Such a notion of rooms was used as walls play a crucial role in the fluctuation of Wi-Fi RSSI values [31, 32]. The area was planned into a grid of cells of 1.5 × 1.5 m². Each cell center was marked as the Reference Point (RP) for FP collection.

Input: training dataset with total A predictors; missing value replacement MVr; maximum number of clusters Kmax
Output: predicted location Lx
For training:
    Replace empty values with MVr
    Apply PCA on dataset to generate A′ predictors
    For k = 1 -> Kmax
        Generate clusters
        Generate and save k data subsets
        For each p ∈ k data subsets
            Train p RFE using Algorithm 2 (training)
            Calculate performance measures
        End for
    End for
    Choose optimal configuration
    Save respective models for GMM, all RFEs
For prediction at a new point x:
    Replace missing values with MVr
    Apply PCA on the FP
    Match one cluster Cmatch
    Invoke RFE of Cmatch using Algorithm 2 (prediction)

Algorithm 1: CEnsLoc training and prediction algorithm.
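The prediction branch of Algorithm 1 amounts to a short dispatch chain: fill in missing readings, project with the stored PCA model, match one cluster, and invoke that cluster's classifier. The sketch below mirrors only this control flow. Everything in the toy `model` is a hypothetical stand-in: the MVr value of -100 dBm, the single projection direction, a threshold in place of GMM cluster matching, and fixed-rule callables in place of trained RFEs.

```python
def replace_missing(rssi, mvr):
    """Substitute the missing value replacement (MVr) for absent readings."""
    return [mvr if value is None else value for value in rssi]


def project(vector, mean, components):
    """Centre the vector and project it onto the stored principal directions."""
    centred = [v - m for v, m in zip(vector, mean)]
    return [sum(c * x for c, x in zip(row, centred)) for row in components]


def predict_room(rssi, model):
    """Prediction branch of Algorithm 1 for one observed fingerprint."""
    filled = replace_missing(rssi, model["mvr"])
    reduced = project(filled, model["mean"], model["components"])
    cluster = model["match_cluster"](reduced)      # stored GMM in the paper
    return model["classifiers"][cluster](reduced)  # that cluster's trained RFE


# Toy stand-ins (all hypothetical): 3 APs reduced to 1 component, a threshold
# in place of GMM matching, and fixed-rule callables in place of trained RFEs.
model = {
    "mvr": -100.0,
    "mean": [-70.0, -70.0, -70.0],
    "components": [[1.0, 0.0, -1.0]],
    "match_cluster": lambda z: 0 if z[0] >= 0 else 1,
    "classifiers": {
        0: lambda z: "L1",
        1: lambda z: "L12",
    },
}

print(predict_room([-50.0, -65.0, None], model))
print(predict_room([None, -65.0, -55.0], model))
```

Keeping the cluster matcher and the per-cluster classifiers as separate stored objects reflects the design in which only one of the K ensembles is evaluated per query.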

The complete dataset consisted of 20087 Wi-Fi RSSI FPs in which a total of 40 APs were detected. Figure 4 depicts the number of FPs captured in each room/location marked as L1–L12. It must be noted that all these APs belong to university infrastructure comprising the SE centre and its immediate neighboring buildings. The FPs were preprocessed following the same process described in Sections 3.1 and 3.2. PCA-based dimension reduction was employed, with optimal results found with 23 PCs. The resultant dataset was divided with a 70:30 stratified ratio into training and testing subsets. The experimental results are discussed using both 10-fold cross validation (10-CV) on the training subset and the unseen 30% test subset. GMM clustering then partitioned the training data into further subsets, and the optimal configuration was two clusters, shared covariance kept as true, and diagonal covariance. Furthermore, one RFE was trained for each subset with 132 trees, 1024 maximum splits, and 8 random features. CEnsLoc was hosted at a machine, and the mean of observed RSSI FPs was used for run-time location estimation after applying the same model of PCA and GMM cluster matching finalized from the training phase. The best matched cluster's respective RFE was invoked for room prediction. Various companion apps can query the location of subscribed users using the IPS configured as a server. Response time was computed as an average of the difference between localization query time and prediction generation time. The following formulae were used to compute the performance parameters:

$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \text{precision} = \frac{TP}{TP + FP}, \qquad \text{recall} = \frac{TP}{TP + FN} \tag{22}$$
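The formulae in Equation (22) can be applied directly to confusion counts. The counts below are made up for illustration and treat one room as the positive class; how per-room values were aggregated into the reported figures is not stated in the text, so no averaging scheme is assumed here. Note that FP in Equation (22) denotes false positives, not fingerprints.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    """Share of positive predictions that are truly positive."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of truly positive samples that are predicted positive."""
    return tp / (tp + fn)


# Made-up counts for one room treated as the positive class.
tp, tn, fp, fn = 90, 880, 10, 20
print(accuracy(tp, tn, fp, fn), precision(tp, fp), recall(tp, fn))
```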

5.1. Classification Effectiveness and Efficiency. Tables 2 and 3 summarize the 10-CV performance evaluation of CEnsLoc, comparing it with k-NN [33], artificial neural network (ANN) [34], K* [35], FURIA [36], and DeepLearning4J [36]. The best results are highlighted throughout with boldface. k-NN results were averaged over six different configurations based on the number of neighbors, the similarity measure employed, and neighbor weightage. Similarly, six different configurations of ANN, namely, 2-, 3-, and 4-layer ANNs, each having a varying number of neurons per layer and employing two learning algorithms, SCG and RBP, were averaged out to obtain the results. For K*, the entropic blend percentage was varied from 10 to 90 percent. The results for FURIA were obtained by varying the count of folds for growth and pruning as well as the minimum instance weight for each split. DeepLearning4J results were obtained by varying the number of neurons per layer, the number of dense layers in the network, and the training algorithm, out of many possible variations in the hyperparameters. The results for CEnsLoc were generated by the optimal configuration found for our proposed approach. For all the approaches, the same set of configurations as used during the 10-CV performance evaluation was used for result generation on the 30% unseen stratified test data subset, whose results are presented in Table 4. The best performance on both the 10-CV results (Table 2) and the test dataset (Table 4) was obtained by CEnsLoc with 97% and 95% accuracy, followed by FURIA (92%, 90%), k-NN (91%, 89%), K* (90%, 87%), ANN (85%, 82%), and DeepLearning4J (73%, 71%), respectively. The results are presented on both 10-CV and the unseen test data subset to show that, for a small dataset, 10-CV results can provide a good performance estimate, as the results on the test data subset were found to show trends similar to those of the 10-CV results. Although deep learning and ANN provide good performance results, finding the optimal configuration for a huge number of tunable parameters is a tricky task. Moreover, during experiments, it was found that despite generic guidelines for parameter tuning, the performance measures and convergence rate of ANN as well as deep learning schemes fluctuate highly with even a slight variation in the number of neurons in layers, the training algorithm, and the number of hidden layers, which makes it harder to find the optimal configuration for ANN and deep learning models. Lazy and instance-based approaches such as k-NN and K* are also good candidates for indoor localization; however, they have a very limited number of tunable parameters, resulting in an inability to surpass CEnsLoc performance despite trying different parameter combinations. They do not generalize as well as RFE, as the end result of prediction is heavily dependent on the majority vote by the k closest matched samples in the dataset. They also need template matching with the entire dataset for one location prediction, which results in growing response time with an increasing number of samples in the dataset. Such growth is highly likely in practical real-world scenarios, as a typical building has a large number of visible APs and a sufficiently large number of samples is also required for classifiers to work properly.

Input: data subset with total A′ predictors for training; number of trees Ntree; allowed maximum number of splits Smax; random number of predictors/features f
Output: estimated location Lx
For system training:
Step 1: for l = 1 to Ntree
    (i) Choose a bootstrap sample set (SS) of size (Nsample) with replacement from the training data subset
    (ii) Generate a Random Forest Tree (Tl) to SS via recursively iterating (a)–(c) for every terminal node of the tree unless the maximum number of splits (Smax) is reached
        (a) Randomly select f features/variables from the A′ predictors (f << A′)
        (b) Choose the best feature/split-point from the f employing the Gini Index
        (c) Split the node into two children nodes
Step 2: Produce the resulting ensemble of trees {Tl}, l = 1, ..., Ntree
For location prediction at a new point x from RFE L_rf^Ntree:
    Let Lm(x) be the room/class prediction by the mth RFE tree
    L_rf^Ntree(x) = maj vote {Lm(x)}, m = 1, ..., Ntree

Algorithm 2: RFE classification algorithm for training and prediction.
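Steps (a) to (c) of Algorithm 2, the random feature selection followed by a Gini-based choice of split point, can be sketched for a single node as follows. This is an illustration under assumptions, not the authors' implementation: the function names are hypothetical, candidate thresholds are taken from the observed feature values, and the split is scored by the size-weighted Gini impurity of the two children.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, n_features, rng):
    """Steps (a)-(b): sample f features, then pick the split with lowest weighted Gini."""
    candidates = rng.sample(range(len(rows[0])), n_features)
    best = None  # (weighted impurity, feature index, threshold)
    for j in candidates:
        for threshold in sorted({row[j] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            if not left or not right:
                continue  # a split must produce two children (step (c))
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, j, threshold)
    return best


# Four toy fingerprints over two features; feature 0 separates the rooms.
rows = [[-40.0, -70.0], [-45.0, -60.0], [-80.0, -65.0], [-85.0, -72.0]]
labels = ["L1", "L1", "L2", "L2"]
print(best_split(rows, labels, n_features=2, rng=random.Random(0)))
```

In a full ensemble this search would be repeated at every terminal node of every tree, with each tree grown on its own bootstrap sample.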

A minimum response time of 2.05E−05 seconds was obtained by FURIA. DeepLearning4J stood second with a response time of 6.82E−05 seconds. ANN, k-NN, and CEnsLoc all had response times on the scale of E−04 seconds, which is about 10 times slower than the aforementioned two approaches, but the difference is trifling and cannot be detected by any human user of the system. The accuracy, precision, and recall provided by CEnsLoc were much greater than those of all other approaches.

Figure 3: Room labels (L1–L12) and marked RPs.

Figure 4: Number of samples per room (x-axis: locations L1–L12; y-axis: number of samples, 0 to 5000; bar labels: 446, 446, 717, 2700, 2700, 2700, 2700, 1446, 250, 743, 743, and 4496).

FURIA stands out as the second best performer re-garding indoor localization which is an upgraded version ofthe RIPPER algorithm indicating that rule-based algo-rithms specially fuzzified versions with good generalizationcapabilities perform well for location estimation as wellHowever it lagged behind CEnsLoc in terms of accuracyprecision and recall by 5 10 and 15 respectively

(e details of both 10-CV and test dataset results for allIPS compared and CEnsLoc are depicted in Figure 5 for sideby side visual comparison

5.2. Out-of-Bag (OOB) Error Results. There is a performance measure called OOB error, which is peculiar to RFE. During training of an RFE, data subsets are generated with replacement, resulting in some repeated and some left-out observations (OOB observations) for each tree. That particular tree does not train on these left-out OOB observations. The prediction capability of an RFE can be measured using the "OOB loss," which is the error made on unseen OOB observations during training. The OOB loss measure has been investigated and shown to provide an upper bound on testing error [37], which is specifically useful for small-sized datasets. Hence, OOB error can be used just like, or instead of, an unseen test data subset if the available dataset is small or an unseen test dataset is unavailable. It provides a very good estimate of the trained classifier's generalization capability. Table 5 summarizes the OOB loss compared with the averaged-out 10-CV loss, indicating that it indeed bounds it.
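The bookkeeping behind the OOB estimate can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments (which ran in MATLAB): the names `bootstrap_indices`, `oob_error`, and `majority_stump` are hypothetical, and the toy stump, which always predicts the most frequent in-bag label, merely stands in for a real decision tree.

```python
import random
from collections import Counter


def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement (one tree's in-bag set)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def oob_error(labels, n_trees, train_tree, rng):
    """Estimate OOB error: each sample is voted on only by trees that never saw it."""
    n = len(labels)
    votes = [Counter() for _ in range(n)]
    for _ in range(n_trees):
        in_bag = bootstrap_indices(n, rng)
        predict = train_tree(in_bag)  # hypothetical: returns a callable index -> label
        for i in set(range(n)) - set(in_bag):  # left-out (OOB) observations of this tree
            votes[i][predict(i)] += 1
    # Score only the samples that were out-of-bag for at least one tree.
    scored = [(i, v.most_common(1)[0][0]) for i, v in enumerate(votes) if v]
    if not scored:
        return float("nan")
    wrong = sum(1 for i, predicted in scored if predicted != labels[i])
    return wrong / len(scored)


def majority_stump(labels):
    """Toy learner (assumption): predicts the most frequent in-bag label for any sample."""
    def train(in_bag):
        top = Counter(labels[i] for i in in_bag).most_common(1)[0][0]
        return lambda i: top
    return train


# Toy usage: 15 samples from room L1 and 5 from room L2.
rooms = ["L1"] * 15 + ["L2"] * 5
estimate = oob_error(rooms, n_trees=25, train_tree=majority_stump(rooms), rng=random.Random(2))
```

Because every vote comes from a tree that did not train on the sample being scored, the estimate behaves like an error on unseen data without holding out a separate test subset.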

6. Conclusion and Future Work

Location prediction/estimation provides derivation of meaningful context for a broad range of services and applications. Indoor localization can open up a plethora of new opportunities because humans spend most of their time indoors. CEnsLoc offers shorter response time and an overall improvement in accuracy, precision, and recall. With only a few parameters to be tuned, it is suited for FP-based localization, which requires frequent recollection of data and retraining. CEnsLoc was able to attain 97% accuracy in comparison with other IPS (averaged over 6 different configurations), namely, FURIA, k-NN, K*, ANN, and DeepLearning4J, with 92%, 91%, 90%, 85%, and 73% accuracies, respectively. It can be utilized for elderly assistance, navigation, smart buildings, and smart transportation, to name a few potential applications.

Our future work includes deployment of CEnsLoc across a wide range of civil infrastructures, including different floors and/or buildings, to understand its performance in more detail, as well as its scalability using crowdsourcing. We also aim to build safety, security, and evacuation guide applications with CEnsLoc at their core for users in offices, universities, and retail. Furthermore, integration with GPS, Bluetooth, and PDR to further enhance accuracy, utilizing available hybrid technologies at a given time, is also part of the plan.

Table 2: Tenfold cross-validated performance evaluation and comparison of CEnsLoc.

| Algorithm | Accuracy | Precision | Recall |
|---|---|---|---|
| k-NN | 0.91 | 0.88 | 0.87 |
| ANN | 0.85 | 0.83 | 0.78 |
| K* | 0.90 | 0.96 | 0.62 |
| FURIA | 0.92 | 0.89 | 0.82 |
| DeepLearning4J | 0.73 | 0.51 | 0.41 |
| CEnsLoc | 0.97 | 0.98 | 0.97 |

Table 3: Tenfold cross-validated performance comparison of CEnsLoc in terms of training and response time (seconds).

| Time (sec) | Training time (10-fold) | Avg. training time (1-fold) | Response time (dataset) | Response time (1 sample) |
|---|---|---|---|---|
| k-NN | - | - | 197 | 1.70E−04 |
| ANN | 26720 | 2672 | 018 | 1.41E−04 |
| K* | - | - | 10309 | 8.17E−02 |
| FURIA | 1587 | 158 | 003 | 2.05E−05 |
| DeepLearning4J | 75559 | 755 | 009 | 6.82E−05 |
| CEnsLoc | 14007 | 14007 | 235 | 2.08E−04 |

Table 4: Performance evaluation and comparison of CEnsLoc on test subset.

| Algorithm | Accuracy | Precision | Recall |
|---|---|---|---|
| k-NN | 0.89 | 0.87 | 0.85 |
| ANN | 0.82 | 0.81 | 0.76 |
| K* | 0.87 | 0.94 | 0.60 |
| FURIA | 0.90 | 0.86 | 0.79 |
| DeepLearning4J | 0.71 | 0.48 | 0.39 |
| CEnsLoc | 0.95 | 0.96 | 0.94 |

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07048697).

Figure 5: Performance measures (precision, recall, and accuracy) on both 10-CV and test dataset for k-NN, 4-, 3-, and 2-layer ANN (SCG and RBP), K*, FURIA, DeepLearning4J, and CEnsLoc (bar chart; y-axis from 0 to 1.2).

Table 5: OOB loss.

| Average 10-fold loss | OOB loss |
|---|---|
| 0.0292813 | 0.0300123 |

References

[1] J. Torres-Sospedra, R. Montoliu, S. Trilles, O. Belmonte, and J. Huerta, "Comprehensive analysis of distance and similarity measures for Wi-Fi fingerprinting indoor positioning systems," Expert Systems with Applications, vol. 42, no. 23, pp. 9263–9278, 2015.

[2] H. Mehmood and N. K. Tripathi, "Cascading artificial neural networks optimized by genetic algorithms and integrated with global navigation satellite system to offer accurate ubiquitous positioning in urban environment," Computers, Environment and Urban Systems, vol. 37, no. 1, pp. 35–44, 2013.

[3] M. M. Soltani, A. Motamedi, and A. Hammad, "Enhancing cluster-based RFID tag localization using artificial neural networks and virtual reference tags," Automation in Construction, vol. 54, pp. 93–105, 2015.

[4] M. L. Rodrigues, L. F. M. Vieira, and M. F. M. Campos, "Fingerprinting-based radio localization in indoor environments using multiple wireless technologies," in Proceedings of IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, pp. 1203–1207, Toronto, ON, Canada, September 2011.

[5] J. Luo and H. Gao, "Deep belief networks for fingerprinting indoor localization using ultrawideband technology," International Journal of Distributed Sensor Networks, vol. 12, no. 1, article 5840916, 2016.

[6] S. Saeedi, L. Paull, M. Trentini, and H. Li, "Neural network-based multiple robot simultaneous localization and mapping," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 880–885, 2011.

[7] M. Azizyan, R. R. Choudhury, and I. Constandache, "SurroundSense: mobile phone localization via ambience fingerprinting," in Proceedings of the 15th Annual International Conference on Mobile Computing and Networking-MobiCom'09, pp. 261–272, Beijing, China, 2009.

[8] L. Pei, R. Chen, J. Liu et al., "Motion recognition assisted indoor wireless navigation on a mobile phone," in Proceedings of the 23rd International Technical Meeting of The Satellite Division of the Institute of Navigation, pp. 3366–3375, Portland, OR, USA, September 2010.

[9] P. S. Nagpal and R. Rashidzadeh, "Indoor positioning using magnetic compass and accelerometer of smartphones," in Proceedings of International Conference on Selected Topics in Mobile and Wireless Networking (MoWNeT), pp. 140–145, Montreal, Canada, 2013.

[10] J. Menke and A. Zakhor, "Multi-modal indoor positioning of mobile devices," in Proceedings of IEEE International Conference on Indoor Positioning and Indoor Navigation, pp. 13–16, Alberta, Canada, October 2015.

[11] M. Oussalah, M. Alakhras, and M. I. Hussein, "Multivariable fuzzy inference system for fingerprinting indoor localization," Fuzzy Sets and Systems, vol. 269, pp. 65–89, 2015.

[12] N. Li, J. Chen, Y. Yuan, X. Tian, Y. Han, and M. Xia, "A wi-fi indoor localization strategy using particle swarm optimization based artificial neural networks," International Journal of Distributed Sensor Networks, vol. 12, no. 3, article 4583147, 2016.

[13] C. Song, J. Wang, and G. Yuan, "Hidden naive bayes indoor fingerprinting localization based on best-discriminating AP selection," ISPRS International Journal of Geo-Information, vol. 5, no. 10, p. 189, 2016.

[14] P. Bahl and V. N. Padmanabhan, "RADAR: an in-building RF based user location and tracking system," in Proceedings of IEEE INFOCOM 2000, Conference on Computer Communications, Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies (Cat. No. 00CH37064), vol. 2, pp. 775–784, Tel Aviv, Israel, March 2000.

[15] M. Cooper, J. Biehl, G. Filby, and S. Kratz, "LoCo: boosting for indoor location classification combining Wi-Fi and BLE," Personal and Ubiquitous Computing, vol. 20, no. 1, pp. 1–14, 2016.

[16] C. Wang, F. Wu, Z. Shi, and D. Zhang, "Indoor positioning technique by combining RFID and particle swarm optimization-based back propagation neural network," Optik (Stuttg), vol. 127, no. 17, pp. 6839–6849, 2016.

[17] J. Xu, H. Dai, and W. Ying, "Multi-layer neural network for received signal strength-based indoor localisation," IET Communications, vol. 10, no. 6, pp. 717–723, 2016.

[18] W. Zhang, K. Liu, W. Zhang, Y. Zhang, and J. Gu, "Deep neural networks for wireless localization in indoor and outdoor environments," Neurocomputing, vol. 194, pp. 279–287, 2016.

[19] L. Calderoni, M. Ferrara, A. Franco, and D. Maio, "Indoor localization in a hospital environment using Random Forest classifiers," Expert Systems with Applications, vol. 42, no. 1, pp. 125–134, 2015.

[20] E. Jedari, Z. Wu, R. Rashidzadeh, and M. Saif, "Wi-Fi based indoor location positioning employing random forest classifier," in Proceedings of 2015 International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 13–16, Banff, Alberta, Canada, October 2015.

[21] Y. Mo, Z. Zhang, Y. Lu, W. Meng, and G. Agha, "Random forest based coarse locating and KPCA feature extraction for indoor positioning system," Mathematical Problems in Engineering, vol. 2014, Article ID 850926, 8 pages, 2014.

[22] R. Gorak and M. Luckner, "Malfunction immune Wi–Fi localisation method," in Computational Collective Intelligence, pp. 328–337, Springer, Cham, Switzerland, 2015.

[23] R. Gorak and M. Luckner, "Modified random forest algorithm for Wi–Fi indoor localization system," in Proceedings of International Conference on Computational Collective Intelligence, pp. 147–157, Cham, Switzerland, September 2016.

[24] J. Niu, B. Wang, L. Cheng, and J. J. P. C. Rodrigues, "WicLoc: an indoor localization system based on WiFi fingerprints and crowdsourcing," in Proceedings of 2015 IEEE International Conference on Communications (ICC), pp. 3008–3013, London, UK, June 2015.

[25] G. Ding, Z. Tan, J. Zhang, and L. Zhang, "Fingerprinting localization based on affinity propagation clustering and artificial neural networks," in Proceedings of IEEE Wireless Communications and Networking Conference (WCNC), pp. 2317–2322, Shanghai, China, April 2013.

[26] O. Leon, J. Hernandez-Serrano, and M. Soriano, "Securing cognitive radio networks," International Journal of Communication Systems, vol. 23, no. 5, pp. 633–652, 2010.

[27] S. Tuncer and T. Tuncer, "Indoor localization with bluetooth technology using artificial neural networks," in Proceedings of the IEEE 19th International Conference on Intelligent Engineering Systems, pp. 213–217, Bratislava, Slovakia, September 2015.

[28] B. A. Akram, A. H. Akbar, B. Wajid, O. Shafiq, and A. Zafar, "LocSwayamwar: finding a suitable ML algorithm for wi-fi fingerprinting based indoor positioning system," in Lecture Notes in Electrical Engineering, A. Boyaci, A. Ekti, M. Aydin, and S. Yarkan, Eds., vol. 504, Springer, Singapore, 2018.

[29] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.

[30] K. Kaji and N. Kawaguchi, "Design and implementation of wifi indoor localization based on Gaussian mixture model and particle filter," in Proceedings of 2012 International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 1–9, Sydney, Australia, November 2012.

[31] C. Wu, Z. Yang, Y. Liu, and W. Xi, "WILL: wireless indoor localization without site survey," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 4, pp. 839–848, 2013.

[32] S. Yang, P. Dessai, M. Verma, and M. Gerla, "FreeLoc: calibration-free crowdsourced indoor localization," in Proceedings of IEEE INFOCOM, pp. 2481–2489, Turin, Italy, April 2013.

[33] T. Seidl and H.-P. Kriegel, "Optimal multi-step k-nearest neighbor search," ACM SIGMOD Record, vol. 27, no. 2, pp. 154–165, 1998.

[34] W. S. Mcculloch and W. Pitts, "A logical calculus nervous activity," Bulletin of Mathematical Biology, vol. 52, no. 1-2, pp. 99–115, 1990.

[35] J. G. Cleary and L. E. Trigg, "K*: an instance-based learner using an entropic distance measure," in Proceedings of Twelfth International Conference on Machine Learning, vol. 5, pp. 1–14, Tahoe City, CA, USA, July 1995.

[36] J. Huhn and E. Hullermeier, "FURIA: an algorithm for unordered fuzzy rule induction," Data Mining and Knowledge Discovery, vol. 19, no. 3, pp. 293–319, 2009.

[37] S. Ciss, Generalization Error and Out-of-bag Bounds in Random (Uniform) Forests, Ph.D. thesis, French University, Paris, France, 2015.


into subsets using GMM clustering (K data subsets). Equation (4) is a distribution based on a 2D Gaussian, where the mean is represented by $\mu$ and the covariance matrix by $\Sigma$. A GMM with N as the number of overlapping distributions is described by Equations (5) and (6):

$$\mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{2\pi\sqrt{|\Sigma|}} \exp\left\{-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right\}, \tag{4}$$

$$P(x) = \sum_{k=1}^{N} \pi_k\, \mathcal{N}(x \mid \mu_k, \Sigma_k), \tag{5}$$

$$\sum_{k=1}^{N} \pi_k = 1, \tag{6}$$

where $\pi_k$ defines the mixing coefficient that expresses the weight of each mixing element (the weights sum to 1). The resultant shape of the 2D Gaussian is the average of the individual distributions in terms of covariance and mixing coefficients. Assuming that a linear mix of weighted coefficients for each of the respective distributions' average and covariance is obtained, and by incorporating sufficient distributions, a final density function may be obtained. The reason behind GMM clustering was the similarity between the Gaussian distribution and the radio propagation characteristics of a Wi-Fi AP [30], which makes GMM a highly suitable candidate for clustering Wi-Fi RSSI vectors.
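Equations (4) to (6) can be evaluated directly, as in the minimal Python sketch below. The function names, the two-component toy model, and the rule of matching a point to the component with the largest weighted density are illustrative assumptions; the source only states that the saved GMM model is used for cluster matching.

```python
import math


def gaussian_2d(x, mean, cov):
    """Density of Equation (4) for a 2D point x, a mean vector, and a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu) via the closed-form 2x2 inverse.
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def mixture_density(x, weights, means, covs):
    """Equation (5): weighted sum of the component densities."""
    return sum(w * gaussian_2d(x, m, s) for w, m, s in zip(weights, means, covs))


def match_cluster(x, weights, means, covs):
    """Index of the component with the largest weighted density at x."""
    scores = [w * gaussian_2d(x, m, s) for w, m, s in zip(weights, means, covs)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical two-component model over two features, with a shared diagonal covariance.
weights = [0.6, 0.4]  # mixing coefficients, Equation (6): they sum to 1
means = [(-60.0, -70.0), (-85.0, -90.0)]
covs = [((25.0, 0.0), (0.0, 25.0))] * 2

cluster = match_cluster((-62.0, -71.0), weights, means, covs)
```

In CEnsLoc the matched index would select which per-cluster RFE to invoke; here it simply identifies the nearer toy component.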

Furthermore, each data subset is used for training an RFE to predict the room label as a multiclass classifier. The trained models of GMM as well as all RFEs are stored for later use in the prediction phase.

The algorithm for the training and prediction phases of CEnsLoc is given in Algorithm 1.

Let the total number of samples in the training set be $N_{\mathrm{sample}}$, the number of trees be $N_{\mathrm{tree}}$, the maximum number of splits allowed be $S_{\max}$, and the number of predictors for the classifier be $A'$; let $f$ be the number of input predictors that are utilized to split at a tree node, and let $tc$ denote the total number of classes in the dataset. For finding the best split, RFE uses the Gini index as given in the following equation, where $P_j$ is the relative frequency of class $j$ in $N_{\mathrm{sample}}$:

$$\mathrm{Gini}(N_{\mathrm{sample}}) = 1 - \sum_{j=1}^{tc} (P_j)^2. \tag{7}$$

RFE is trained using the method presented in Algorithm 2.
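The Gini index of Equation (7) is straightforward to compute from a list of class labels, as the short Python sketch below shows. The function names are illustrative, and the size-weighted score for a candidate split is the usual CART convention, assumed here rather than taken from the source. Both functions assume non-empty inputs.

```python
from collections import Counter


def gini_index(labels):
    """Equation (7): 1 minus the sum of squared relative class frequencies."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_split_gini(left, right):
    """Size-weighted Gini of the two child nodes produced by a candidate split (assumed convention)."""
    n = len(left) + len(right)
    return (len(left) * gini_index(left) + len(right) * gini_index(right)) / n


# A node holding two rooms in equal proportion is maximally impure for two classes.
node = ["L1", "L1", "L2", "L2"]
impurity = gini_index(node)
# A split that separates the two rooms perfectly leaves pure children.
after_split = weighted_split_gini(["L1", "L1"], ["L2", "L2"])
```

A tree grower would evaluate such a score for each candidate split over the f randomly chosen predictors and keep the split with the lowest value.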

4.3. Prediction Phase. The average of the collected RSSI FPs is fed to CEnsLoc. The A′ PCs are computed by the same method as in the training phase. The saved GMM model is invoked for cluster matching. The matched cluster's trained RFE is then invoked, and the final decision is computed by the majority vote described in Algorithm 2.

4.4. Time Complexity of Training and Prediction Phases for CEnsLoc. Ceteris paribus, the time complexity of the training phase and the prediction phase for CEnsLoc is essentially dependent upon the size of the experimental area, our acquisition regimen, the resultant dataset of FPs, and how the dataset is manipulated by the tandem of schemes we employ.

Figure 2: Proposed localization methodology (CEnsLoc). Training phase: Wi-Fi FPs collection (Wi-Fi RSSI FPs with room labels), preprocessing, PCA, generation of data subsets (GMM clustering), and one RFE trained for each cluster subset. Prediction phase: run-time observed Wi-Fi FP, preprocessing, PCA, cluster matching, cluster selection, invocation of the respective pretrained RFE, and the estimated room.

Table 1: A comparison of performance measures of various classification algorithms for Wi-Fi-based position estimation.

| Algorithm | Time to build model (sec) | Accuracy (%) | Kappa statistic | RMSE | Precision | Recall | F1 | MCC | ROC area |
|---|---|---|---|---|---|---|---|---|---|
| K* | 0 | 99.52 | 0.98 | 0.02 | 0.99 | 0.99 | 0.99 | 0.99 | 1 |
| k-NN | 0 | 99.06 | 0.99 | 0.03 | 0.99 | 0.99 | 0.99 | 0.98 | 0.99 |
| RFE | 111 | 98.76 | 0.98 | 0.04 | 0.98 | 0.98 | 0.98 | 0.98 | 1 |
| FURIA | 592 | 97.26 | 0.96 | 0.05 | 0.97 | 0.97 | 0.97 | 0.96 | 0.99 |
| Multilayer perceptron | 2584 | 97.05 | 0.96 | 0.06 | 0.97 | 0.97 | 0.97 | 0.96 | 0.99 |
| J48 | 01 | 95.91 | 0.95 | 0.08 | 0.95 | 0.95 | 0.95 | 0.95 | 0.98 |
| Dl4jMlp classifier | 2631 | 94 | 0.93 | 0.08 | 0.94 | 0.94 | 0.94 | 0.93 | 0.99 |
| SVM | 582 | 90.6 | 0.89 | 0.12 | 0.93 | 0.90 | 0.90 | 0.9 | 0.94 |
| Naive Bayes | 002 | 89.79 | 0.88 | 0.12 | 0.91 | 0.89 | 0.89 | 0.89 | 0.99 |
| AdaBoost with decision stump | 01 | 36.81 | 0.22 | 0.25 | 0.15 | 0.36 | 0.21 | 0.18 | 0.76 |

4.4.1. Time Complexity of Training. In the training phase, the time complexity for PCA is governed by the following equation:

$$O\left(\min\left(A^{3}, N^{3}\right)\right), \tag{8}$$

where A = number of predictors and N = number of observations.

For training a decision tree (DT) that has not been pruned, the expression is as follows:

$$O\left(A \times N \log(N)\right). \tag{9}$$

As RFE consists of numerous DTs, merely a small number $f$ of the total predictors A is used. Complexity for a single DT in RFE is represented by Equation (10) and the complexity of $N_{\mathrm{tree}}$ trees by Equation (11):

$$O\left(f \times N \log(N)\right), \tag{10}$$

$$O\left(N_{\mathrm{tree}} \times f \times N \log(N)\right), \tag{11}$$

where $N_{\mathrm{tree}}$ = number of trees in RFE and $f$ = number of random features that are chosen to get the best split.

When the depth of the grown trees is controlled using $S_{\max}$, the training complexity of one RFE is as follows:

$$O\left(N_{\mathrm{tree}} \times f \times N \times S_{\max}\right). \tag{12}$$

Since K ensembles are grown for predicting room level, the time-to-train complexity is represented by the following equation:

$$O\left(N_{\mathrm{tree}} \times f \times N \times S_{\max} \times K\right). \tag{13}$$

For GMM, the complexity is expressed by the following equation:

$$O\left(N \times K \times D^{3}\right), \tag{14}$$

where N = number of samples, K = number of components, and D = number of dimensions.

Incorporating all of the above, the time complexity to train CEnsLoc is as follows:

$$O\left(\min\left(A^{3}, N^{3}\right)\right) + O\left(N \times K \times D^{3}\right) + O\left(N_{\mathrm{tree}} \times f \times N \times S_{\max} \times K\right). \tag{15}$$
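To see which term of Equation (15) dominates, the three terms can be evaluated for the configuration reported in the experimental section (40 APs, 23 PCs, two clusters, 132 trees, 8 random features, and 1024 maximum splits). The Python sketch below is only a rough illustration: big-O expressions ignore constant factors, so the numbers are relative magnitudes rather than operation counts, and the value of N assumes the 70% training share of the 20087 collected FPs. The function name is hypothetical.

```python
def training_cost_terms(A, N, D, K, n_tree, f, s_max):
    """Evaluate the three terms of Equation (15), ignoring constant factors."""
    return {
        "pca": min(A ** 3, N ** 3),         # Equation (8)
        "gmm": N * K * D ** 3,              # Equation (14)
        "rfe": n_tree * f * N * s_max * K,  # Equation (13)
    }


# Assumption: N is the 70% training share of the 20087 fingerprints.
n_train = 20087 * 7 // 10

terms = training_cost_terms(A=40, N=n_train, D=23, K=2, n_tree=132, f=8, s_max=1024)
dominant = max(terms, key=terms.get)
```

Under these assumptions the ensemble term is the largest by several orders of magnitude, which is consistent with the RFE training step driving the overall training cost.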

4.4.2. Time Complexity of Prediction. For prediction, the time complexity for PCA is given by the following equation:

$$O\left(\min\left(A^{3}, N^{3}\right)\right). \tag{16}$$

The complexities of a DT and of an ensemble in terms of prediction time are shown by Equations (17) and (18), respectively:

$$O\left(N \log(N)\right), \tag{17}$$

$$O\left(N_{\mathrm{tree}} \times N \log(N)\right). \tag{18}$$

$S_{\max}$ controls the depth of the trees; therefore, the complexity for an ensemble is given by the following equation:

$$O\left(N_{\mathrm{tree}} \times N \times S_{\max}\right). \tag{19}$$

Only a single RFE out of the K ensembles gets invoked for predicting a room. The time complexity for GMM is given by the following equation:

$$O\left(K \times D^{3}\right). \tag{20}$$

Finally, the overall complexity for CEnsLoc in terms of prediction is as follows:

$$O\left(\min\left(A^{3}, N^{3}\right)\right) + O\left(K \times D^{3}\right) + O\left(N_{\mathrm{tree}} \times N \times S_{\max}\right). \tag{21}$$

5. Experimental Results and Discussion

This section details the hardware equipment, the software used, and the particulars of the experiments that were used to evaluate the performance of CEnsLoc in light of accuracy, precision, recall, and training and response time.

An Intel machine (64-bit Xeon X5650) with a master clock at 2.67 GHz, 24 GB RAM, and 64-bit Windows 10 Education was used for experimentation in MATLAB. The real dataset was developed through FP collection at the ground floor of the Software Engineering (SE) Centre, University of Engineering and Technology (UET), Lahore, Pakistan. The building's dimensions are 39 m × 31 m (1209 m²), containing offices, classrooms, laboratories, and open corridors. Figures 3 and 4 depict the building's floor plan, room labels (L1–L10: closed rooms, L12: open corridor, and L11: a semiopen room), a total of 180 RPs, and the number of samples collected per room. RPs are represented by small coloured dots in the rooms in Figure 3, and Figure 4 lists the total number of samples collected at each location (L1–L12). This notion of rooms was used because walls play a crucial role in the fluctuation of Wi-Fi RSSI values [31, 32]. The area was planned into a grid of cells of 1.5 × 1.5 m². Each cell center was marked as the Reference Point (RP) for FP collection.

Input: training dataset with total A predictors; missing value replacement MVr; maximum number of clusters Kmax
Output: predicted location Lx
For training:
    Replace empty values with MVr
    Apply PCA on dataset to generate A′ predictors
    For k = 1 -> Kmax
        Generate clusters
        Generate and save k data subsets
        For each p ∈ k data subsets
            Train pth RFE using Algorithm 2 (training)
            Calculate performance measures
        End for
    End for
    Choose optimal configuration
    Save respective models for GMM, all RFEs
For prediction at a new point x:
    Replace missing values with MVr
    Apply PCA on the FP
    Match one cluster Cmatch
    Invoke RFE of Cmatch using Algorithm 2 (prediction)

ALGORITHM 1: CEnsLoc training and prediction algorithm.
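The prediction branch of Algorithm 1 amounts to a short routing pipeline, sketched below in Python. This is a structural illustration only: the saved PCA model, the GMM cluster matcher, and the per-cluster RFEs are passed in as plain callables, and the toy stand-ins, the function names, and the value of −100 for MVr are assumptions rather than details from the source.

```python
def replace_missing(fp, mvr):
    """Substitute the missing-value replacement MVr for APs that were not heard (None)."""
    return [mvr if rssi is None else rssi for rssi in fp]


def mean_fingerprint(fps):
    """Column-wise mean of several run-time fingerprints."""
    return [sum(column) / len(column) for column in zip(*fps)]


def predict_room(fps, mvr, project, match_cluster, ensembles):
    """Prediction steps of Algorithm 1: clean, project, match a cluster, invoke its RFE."""
    cleaned = [replace_missing(fp, mvr) for fp in fps]
    reduced = project(mean_fingerprint(cleaned))  # saved PCA model
    cluster = match_cluster(reduced)              # saved GMM model
    return ensembles[cluster](reduced)            # RFE trained for that cluster


# Toy stand-ins (assumptions) for the saved models.
room = predict_room(
    fps=[[-60, None, -80], [-62, -90, None]],
    mvr=-100,
    project=lambda fp: fp[:2],
    match_cluster=lambda z: 0 if z[0] > -70 else 1,
    ensembles={0: lambda z: "L1", 1: lambda z: "L12"},
)
```

Passing the models in as callables keeps the routing logic independent of how PCA, GMM, and the ensembles are actually implemented.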

The complete dataset consisted of 20087 Wi-Fi RSSI FPs, in which a total of 40 APs were detected. Figure 4 depicts the number of FPs captured in each room/location marked as L1–L12. It must be noted that all these APs belong to the university infrastructure comprising the SE Centre and its immediate neighboring buildings. The FPs were preprocessed following the same process described in Sections 3.1 and 3.2. PCA-based dimension reduction was employed, with optimal results found with 23 PCs. The resultant dataset was divided in a 70:30 stratified ratio into training and testing subsets. The experimental results are discussed using both 10-fold cross validation (10-CV) on the training subset and the unseen 30% test subset. GMM clustering then partitioned the training data into further subsets, and the optimal configuration was two clusters, shared covariance kept as true, and diagonal covariance. Furthermore, one RFE was trained for each subset with 132 trees, 1024 maximum splits, and 8 random features. CEnsLoc was hosted on a machine, and the mean of the observed RSSI FPs was used for run-time location estimation after applying the same PCA model and the GMM cluster matching finalized in the training phase. The best matched cluster's respective RFE was invoked for room prediction. Various companion apps can query the location of subscribed users using the IPS configured as a server. Response time was computed as the average of the difference between localization query time and prediction generation time. The following formulae were used to compute the performance parameters:

$$\mathrm{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{precision} = \frac{TP}{TP + FP}, \qquad \mathrm{recall} = \frac{TP}{TP + FN}, \tag{22}$$

where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively (in this equation, FP does not stand for fingerprint).
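The formulae in Equation (22) can be applied per room by treating one label as the positive class and all others as negative. The Python sketch below shows this one-vs-rest counting on made-up labels. How the per-room values were aggregated into the reported figures is not stated in the source, so no averaging is shown. The function names are illustrative, and the denominators are assumed to be nonzero.

```python
def binary_counts(y_true, y_pred, positive):
    """Count TP, TN, FP, and FN for one room treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn


def metrics(tp, tn, fp, fn):
    """Accuracy, precision, and recall as defined in Equation (22)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall


# Made-up labels for five test fingerprints.
y_true = ["L1", "L1", "L2", "L2", "L3"]
y_pred = ["L1", "L2", "L2", "L2", "L3"]

counts = binary_counts(y_true, y_pred, positive="L2")
accuracy, precision, recall = metrics(*counts)
```

Here one L1 fingerprint is misclassified as L2, which lowers the precision for L2 while leaving its recall untouched.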

5.1. Classification Effectiveness and Efficiency. Tables 2 and 3 summarize the 10-CV performance evaluation of CEnsLoc, comparing it with k-NN [33], artificial neural network (ANN) [34], K* [35], FURIA [36], and DeepLearning4J [36]. The best results are highlighted throughout in boldface. The k-NN results were averaged over six different configurations based on the number of neighbors, the similarity measure employed, and the neighbor weightage. Similarly, six different configurations of ANN, namely, 2-, 3-, and 4-layer ANNs, each having a varying number of neurons per layer and employing two learning algorithms, SCG and RBP, were averaged to obtain the results. For K*, the entropic blend percentage was varied from 10 to 90 percent. The results for FURIA were obtained by varying the count of folds for growth and pruning as well as the minimum instance weight for each split. The DeepLearning4J results were obtained by varying the number of neurons per layer, the number of dense layers in the network, and the training algorithm, out of many possible variations in the hyperparameters. The results for CEnsLoc were generated by the optimal configuration found for our proposed approach. For all the approaches, the same set of configurations used during the 10-CV performance evaluation was used for result generation on the 30% unseen stratified test data subset, whose results are presented in Table 4. The best performance on both the 10-CV results (Table 2) and the test dataset (Table 4) was obtained by CEnsLoc with 97% and 95% accuracy, followed by FURIA (92%, 90%), k-NN (91%, 89%), K* (90%, 87%), ANN (85%, 82%), and DeepLearning4J (73%, 71%), respectively. The results are presented on both 10-CV and the unseen test data subset to show that, for a small dataset, 10-CV results can provide a good performance estimate, as the results on the test data subset showed trends similar to those of the 10-CV results. Although deep learning and ANN provide good performance results, finding the optimal configuration for a huge number of tunable parameters is a tricky task. Moreover, during experiments, it was found that, despite generic guidelines for parameter tuning, the performance measures and convergence rate of ANN as well as deep learning schemes fluctuate highly with even a slight variation in the number of neurons in layers, the training algorithm, and the number of hidden layers, which makes it harder to find the optimal configuration for ANN and deep learning models. Lazy and instance-based approaches such as k-NN and K* are also good candidates for indoor localization; however, they have a very limited number of tunable parameters, resulting in an inability to surpass CEnsLoc's performance despite trying different parameter combinations. They do not generalize as well as RFE because the end result of prediction is heavily dependent on the majority vote of the k closest matched samples in the dataset. They also need template matching against the entire dataset for each location prediction, so response time grows with the number of samples in the dataset. Such growth is highly likely in practical real-world scenarios, as a typical building has a large number of visible APs, and a sufficiently large number of samples is also required for classifiers to work properly.

A minimum response time of 2.05E−05 seconds was obtained by FURIA. DeepLearning4J stood second with a response time of 6.82E−05 seconds. ANN, k-NN, and CEnsLoc all had response times on the scale of E−04 seconds, which is about 10 times slower than the aforementioned two approaches, but the difference is trifling and cannot be detected by any human user of the system. The accuracy, precision, and recall provided by CEnsLoc were much greater than those of all other approaches.

Input: data subset with total A′ predictors for training; number of trees Ntree; allowed maximum number of splits Smax; random number of predictors/features f
Output: estimated location Lx
For system training:
Step 1: for l = 1 to Ntree
    (i) Choose a bootstrap sample set (SS) of size Nsample with replacement from the training data subset
    (ii) Grow a Random Forest tree (Tl) on SS by recursively iterating (a)–(c) for every terminal node of the tree until the maximum number of splits (Smax) is reached
        (a) Randomly select f features/variables from the A′ predictors (f << A′)
        (b) Choose the best feature/split-point from the f, employing the Gini index
        (c) Split the node into two children nodes
Step 2: Produce the resulting ensemble of trees {Tl}, l = 1, ..., Ntree
For location prediction at a new point x from the RFE:
    Let Lm(x) be the room/class prediction by the mth RFE tree
    Lrf(x) = majority vote {Lm(x)}, m = 1, ..., Ntree

ALGORITHM 2: RFE classification algorithm for training and prediction.
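The prediction step of Algorithm 2 reduces to a majority vote over the individual trees, which the following Python sketch illustrates. The three threshold rules standing in for trained trees are made up for the example, and the function names are illustrative. Ties are resolved in favor of the label encountered first, which is a property of `collections.Counter` and not something the source specifies.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the room label predicted by most trees (ties go to the label seen first)."""
    return Counter(tree_predictions).most_common(1)[0][0]


def rfe_predict(trees, x):
    """Prediction step of Algorithm 2: every tree votes and the majority label wins."""
    return majority_vote([tree(x) for tree in trees])


# Made-up stand-ins for three trained trees operating on a reduced fingerprint x.
toy_trees = [
    lambda x: "L1" if x[0] > -65 else "L2",
    lambda x: "L1" if x[1] > -80 else "L2",
    lambda x: "L2",
]

predicted_room = rfe_predict(toy_trees, [-60, -75])
```

Two of the three toy trees vote for the same room here, so a single dissenting tree does not change the ensemble's decision.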

FURIA stands out as the second best performer re-garding indoor localization which is an upgraded version ofthe RIPPER algorithm indicating that rule-based algo-rithms specially fuzzified versions with good generalizationcapabilities perform well for location estimation as wellHowever it lagged behind CEnsLoc in terms of accuracyprecision and recall by 5 10 and 15 respectively

(e details of both 10-CV and test dataset results for allIPS compared and CEnsLoc are depicted in Figure 5 for sideby side visual comparison

52 Out-of-Bag (OOB) Error Results (ere is a performancemeasure called OOB error which is peculiar to RFE Duringtraining of RFE data subsets are generated with re-placement resulting in some repeated and left out obser-vations (OOB observations) for each tree (at particulartree does not train on these left out OOB observationsPrediction capability of RFE can be measured using ldquoOOBLossrdquo which is the error made on unseen OOB observations

during training OOB loss measure has been investigatedand shown to provide an upper bound on testing error [37]specifically useful for small-sized datasets Hence OOBerror can be used just likeinstead of unseen test data subsetif the available dataset is small or unseen test dataset isunavailable It provides a very good estimate of the trainedclassifierrsquos generalization capability Table 5 summarizes theOOB loss compared with averaged out 10-CV loss indicatingthat it indeed bounds it

6 Conclusion and Future Work

Location predictionestimation provides derivation ofmeaningful context for a broad range of services and ap-plications Indoor localization can open altogether a pleth-ora of new opportunities because humans spend most oftheir time indoors CEnsLoc offers shorter response timeand an overall improvement in accuracy precision andrecall With only a few parameters to be tuned it is suited forFP-based localization which requires frequent recollection ofdata and retraining CEnsLoc was able to attain 97 ac-curacy in comparison with other IPS averaged over 6 dif-ferent configurations namely FURIA k-NN Klowast ANN andDeepLearning4J with 92 91 90 85 and 73 ac-curacies respectively It can be utilized for elderly assistancenavigation smart buildings and smart transportation toname a few potential applications

Our future work includes deployment of CEnsLoc across a wide range of civil infrastructures, including different floors and/or buildings, to understand its performance in more detail, as well as its scalability using crowdsourcing. We also aim to build safety, security, and evacuation guide applications with CEnsLoc at their core for users in offices, universities, and retail. Furthermore, integration with GPS, Bluetooth, and PDR to further enhance accuracy, utilizing the available hybrid technologies at a given time, is also part of the plan.

Table 2: Tenfold cross-validated performance evaluation and comparison of CEnsLoc.

| Method | Accuracy | Precision | Recall |
|---|---|---|---|
| k-NN | 0.91 | 0.88 | 0.87 |
| ANN | 0.85 | 0.83 | 0.78 |
| K* | 0.90 | 0.96 | 0.62 |
| FURIA | 0.92 | 0.89 | 0.82 |
| DeepLearning4J | 0.73 | 0.51 | 0.41 |
| CEnsLoc | 0.97 | 0.98 | 0.97 |

Table 3: Tenfold cross-validated performance comparison of CEnsLoc in terms of training and response time (seconds).

| Method | Training time (10-fold) | Avg. training time (1-fold) | Response time (dataset) | Response time (1 sample) |
|---|---|---|---|---|
| k-NN | - | - | 1.97 | 1.70E−04 |
| ANN | 26720 | 2672 | 0.18 | 1.41E−04 |
| K* | - | - | 103.09 | 8.17E−02 |
| FURIA | 1587 | 158 | 0.03 | 2.05E−05 |
| DeepLearning4J | 75559 | 755 | 0.09 | 6.82E−05 |
| CEnsLoc | 14007 | 14007 | 2.35 | 2.08E−04 |

Table 4: Performance evaluation and comparison of CEnsLoc on test subset.

| Method | Accuracy | Precision | Recall |
|---|---|---|---|
| k-NN | 0.89 | 0.87 | 0.85 |
| ANN | 0.82 | 0.81 | 0.76 |
| K* | 0.87 | 0.94 | 0.60 |
| FURIA | 0.90 | 0.86 | 0.79 |
| DeepLearning4J | 0.71 | 0.48 | 0.39 |
| CEnsLoc | 0.95 | 0.96 | 0.94 |

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (NRF-2018R1D1A1B07048697).

References

[1] J. Torres-Sospedra, R. Montoliu, S. Trilles, O. Belmonte, and J. Huerta, “Comprehensive analysis of distance and similarity measures for Wi-Fi fingerprinting indoor positioning systems,” Expert Systems with Applications, vol. 42, no. 23, pp. 9263–9278, 2015.

[2] H. Mehmood and N. K. Tripathi, “Cascading artificial neural networks optimized by genetic algorithms and integrated with global navigation satellite system to offer accurate ubiquitous positioning in urban environment,” Computers, Environment and Urban Systems, vol. 37, no. 1, pp. 35–44, 2013.

[3] M. M. Soltani, A. Motamedi, and A. Hammad, “Enhancing cluster-based RFID tag localization using artificial neural networks and virtual reference tags,” Automation in Construction, vol. 54, pp. 93–105, 2015.

[4] M. L. Rodrigues, L. F. M. Vieira, and M. F. M. Campos, “Fingerprinting-based radio localization in indoor environments using multiple wireless technologies,” in Proceedings of IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, pp. 1203–1207, Toronto, ON, Canada, September 2011.

[5] J. Luo and H. Gao, “Deep belief networks for fingerprinting indoor localization using ultrawideband technology,” International Journal of Distributed Sensor Networks, vol. 12, no. 1, article 5840916, 2016.

[6] S. Saeedi, L. Paull, M. Trentini, and H. Li, “Neural network-based multiple robot simultaneous localization and mapping,” IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 880–885, 2011.

[7] M. Azizyan, R. R. Choudhury, and I. Constandache, “SurroundSense: mobile phone localization via ambience fingerprinting,” in Proceedings of the 15th Annual International Conference on Mobile Computing and Networking-MobiCom’09, pp. 261–272, Beijing, China, 2009.

[8] L. Pei, R. Chen, J. Liu et al., “Motion recognition assisted indoor wireless navigation on a mobile phone,” in Proceedings of the 23rd International Technical Meeting of The Satellite Division of the Institute of Navigation, pp. 3366–3375, Portland, OR, USA, September 2010.

[9] P. S. Nagpal and R. Rashidzadeh, “Indoor positioning using magnetic compass and accelerometer of smartphones,” in Proceedings of International Conference on Selected Topics in Mobile and Wireless Networking (MoWNeT), pp. 140–145, Montreal, Canada, 2013.

[10] J. Menke and A. Zakhor, “Multi-modal indoor positioning of mobile devices,” in Proceedings of IEEE International Conference on Indoor Positioning and Indoor Navigation, pp. 13–16, Alberta, Canada, October 2015.

[11] M. Oussalah, M. Alakhras, and M. I. Hussein, “Multivariable fuzzy inference system for fingerprinting indoor localization,” Fuzzy Sets and Systems, vol. 269, pp. 65–89, 2015.

[12] N. Li, J. Chen, Y. Yuan, X. Tian, Y. Han, and M. Xia, “A wi-fi indoor localization strategy using particle swarm optimization based artificial neural networks,” International Journal of Distributed Sensor Networks, vol. 12, no. 3, article 4583147, 2016.

Figure 5: Performance measures (precision, recall, and accuracy) on both 10-CV and test dataset for k-NN, 2-, 3-, and 4-layer ANN (SCG and RBP), K*, FURIA, DeepLearning4J, and CEnsLoc.

Table 5: OOB loss.

| Average 10-fold loss | OOB loss |
|---|---|
| 0.0292813 | 0.0300123 |


[13] C. Song, J. Wang, and G. Yuan, “Hidden naive bayes indoor fingerprinting localization based on best-discriminating AP selection,” ISPRS International Journal of Geo-Information, vol. 5, no. 10, p. 189, 2016.

[14] P. Bahl and V. N. Padmanabhan, “RADAR: an in-building RF based user location and tracking system,” in Proceedings of IEEE INFOCOM 2000, Conference on Computer Communications, Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies (Cat. No. 00CH37064), vol. 2, pp. 775–784, Tel Aviv, Israel, March 2000.

[15] M. Cooper, J. Biehl, G. Filby, and S. Kratz, “LoCo: boosting for indoor location classification combining Wi-Fi and BLE,” Personal and Ubiquitous Computing, vol. 20, no. 1, pp. 1–14, 2016.

[16] C. Wang, F. Wu, Z. Shi, and D. Zhang, “Indoor positioning technique by combining RFID and particle swarm optimization-based back propagation neural network,” Optik (Stuttg), vol. 127, no. 17, pp. 6839–6849, 2016.

[17] J. Xu, H. Dai, and W. Ying, “Multi-layer neural network for received signal strength-based indoor localisation,” IET Communications, vol. 10, no. 6, pp. 717–723, 2016.

[18] W. Zhang, K. Liu, W. Zhang, Y. Zhang, and J. Gu, “Deep neural networks for wireless localization in indoor and outdoor environments,” Neurocomputing, vol. 194, pp. 279–287, 2016.

[19] L. Calderoni, M. Ferrara, A. Franco, and D. Maio, “Indoor localization in a hospital environment using Random Forest classifiers,” Expert Systems with Applications, vol. 42, no. 1, pp. 125–134, 2015.

[20] E. Jedari, Z. Wu, R. Rashidzadeh, and M. Saif, “Wi-Fi based indoor location positioning employing random forest classifier,” in Proceedings of 2015 International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 13–16, Banff, Alberta, Canada, October 2015.

[21] Y. Mo, Z. Zhang, Y. Lu, W. Meng, and G. Agha, “Random forest based coarse locating and KPCA feature extraction for indoor positioning system,” Mathematical Problems in Engineering, vol. 2014, Article ID 850926, 8 pages, 2014.

[22] R. Gorak and M. Luckner, “Malfunction immune Wi–Fi localisation method,” in Computational Collective Intelligence, pp. 328–337, Springer, Cham, Switzerland, 2015.

[23] R. Gorak and M. Luckner, “Modified random forest algorithm for Wi–Fi indoor localization system,” in Proceedings of International Conference on Computational Collective Intelligence, pp. 147–157, Cham, Switzerland, September 2016.

[24] J. Niu, B. Wang, L. Cheng, and J. J. P. C. Rodrigues, “WicLoc: an indoor localization system based on WiFi fingerprints and crowdsourcing,” in Proceedings of 2015 IEEE International Conference on Communications (ICC), pp. 3008–3013, London, UK, June 2015.

[25] G. Ding, Z. Tan, J. Zhang, and L. Zhang, “Fingerprinting localization based on affinity propagation clustering and artificial neural networks,” in Proceedings of IEEE Wireless Communications and Networking Conference (WCNC), pp. 2317–2322, Shanghai, China, April 2013.

[26] O. Leon, J. Hernandez-Serrano, and M. Soriano, “Securing cognitive radio networks,” International Journal of Communication Systems, vol. 23, no. 5, pp. 633–652, 2010.

[27] S. Tuncer and T. Tuncer, “Indoor localization with bluetooth technology using artificial neural networks,” in Proceedings of the IEEE 19th International Conference on Intelligent Engineering Systems, pp. 213–217, Bratislava, Slovakia, September 2015.

[28] B. A. Akram, A. H. Akbar, B. Wajid, O. Shafiq, and A. Zafar, “LocSwayamwar: finding a suitable ML algorithm for wi-fi fingerprinting based indoor positioning system,” in Lecture Notes in Electrical Engineering, A. Boyaci, A. Ekti, M. Aydin, and S. Yarkan, Eds., vol. 504, Springer, Singapore, 2018.

[29] L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.

[30] K. Kaji and N. Kawaguchi, “Design and implementation of wifi indoor localization based on Gaussian mixture model and particle filter,” in Proceedings of 2012 International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 1–9, Sydney, Australia, November 2012.

[31] C. Wu, Z. Yang, Y. Liu, and W. Xi, “WILL: wireless indoor localization without site survey,” IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 4, pp. 839–848, 2013.

[32] S. Yang, P. Dessai, M. Verma, and M. Gerla, “FreeLoc: calibration-free crowdsourced indoor localization,” in Proceedings of IEEE INFOCOM, pp. 2481–2489, Turin, Italy, April 2013.

[33] T. Seidl and H.-P. Kriegel, “Optimal multi-step k-nearest neighbor search,” ACM SIGMOD Record, vol. 27, no. 2, pp. 154–165, 1998.

[34] W. S. Mcculloch and W. Pitts, “A logical calculus nervous activity,” Bulletin of Mathematical Biology, vol. 52, no. 1-2, pp. 99–115, 1990.

[35] J. G. Cleary and L. E. Trigg, “K*: an instance-based learner using an entropic distance measure,” in Proceedings of Twelfth International Conference on Machine Learning, vol. 5, pp. 1–14, Tahoe City, CA, USA, July 1995.

[36] J. Huhn and E. Hullermeier, “FURIA: an algorithm for unordered fuzzy rule induction,” Data Mining and Knowledge Discovery, vol. 19, no. 3, pp. 293–319, 2009.

[37] S. Ciss, Generalization Error and Out-of-bag Bounds in Random (Uniform) Forests, Ph.D. thesis, French University, Paris, France, 2015.


our acquisition regimen, the resultant dataset of FPs, and how the dataset is manipulated by the tandem of schemes we employ.

4.4.1. Time Complexity of Training. In the training phase, the time complexity of PCA is governed by the following equation:

$$O\left(\min\left(A^{3}, N^{3}\right)\right), \tag{8}$$

where $A$ = no. of predictors and $N$ = no. of observations.

For training a decision tree (DT) that has not been pruned, the expression is as follows:

$$O\left(A \times N \log(N)\right). \tag{9}$$

As an RFE consists of numerous DTs, merely a small number $f$ of the total predictors $A$ is used. The complexity for a single DT in an RFE is represented by Equation (10) and the complexity of $N_{\mathrm{tree}}$ trees by Equation (11):

$$O\left(f \times N \log(N)\right), \tag{10}$$

$$O\left(N_{\mathrm{tree}} \times f \times N \log(N)\right), \tag{11}$$

where $N_{\mathrm{tree}}$ = no. of trees in the RFE and $f$ = no. of random features that are chosen to get the best split.

When the depth to which trees are grown is controlled using $S_{\max}$, the training complexity of one RFE is as follows:

$$O\left(N_{\mathrm{tree}} \times f \times N \times S_{\max}\right). \tag{12}$$

Since $K$ ensembles are grown for predicting the room level, the training time complexity is represented by the following equation:

$$O\left(N_{\mathrm{tree}} \times f \times N \times S_{\max} \times K\right). \tag{13}$$

For GMM, the complexity is expressed by the following equation:

$$O\left(N \times K \times D^{3}\right), \tag{14}$$

where $N$ = no. of samples, $K$ = no. of components, and $D$ = no. of dimensions.

Incorporating all of these, the time complexity to train CEnsLoc is as follows:

$$O\left(\min\left(A^{3}, N^{3}\right)\right) + O\left(N \times K \times D^{3}\right) + O\left(N_{\mathrm{tree}} \times f \times N \times S_{\max} \times K\right). \tag{15}$$
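To get a feel for which term of Equation (15) dominates, the three terms can be evaluated for the configuration reported in Section 5 (40 APs, 23 PCs, two clusters, 132 trees, 8 random features, and 1024 maximum splits). The following is a minimal sketch, not the authors' code: the function name is hypothetical, N = 14061 assumes that roughly 70% of the 20087 FPs form the training subset, D = 23 assumes that GMM clustering runs on the PCA output, and constants hidden by the O-notation are ignored, so the numbers are relative growth terms rather than predicted run times.

```python
def training_cost_terms(a, n, d, k, n_tree, f, s_max):
    """Return the three additive terms of Equation (15) with constants dropped."""
    pca = min(a ** 3, n ** 3)          # Equation (8)
    gmm = n * k * d ** 3               # Equation (14)
    rfe = n_tree * f * n * s_max * k   # Equation (13)
    return {"pca": pca, "gmm": gmm, "rfe": rfe}


# Assumed values: N is about 70% of 20087 FPs; D is the number of retained PCs.
terms = training_cost_terms(a=40, n=14061, d=23, k=2, n_tree=132, f=8, s_max=1024)
dominant = max(terms, key=terms.get)

for name, value in terms.items():
    print(f"{name}: {value:.3e}")
print("dominant term:", dominant)
```

Under these assumptions, the ensemble term is the largest of the three, which suggests that the number of trees and the split limit are the parameters to watch when retraining frequently.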

4.4.2. Time Complexity of Prediction. For prediction, the time complexity of PCA is given by the following equation:

$$O\left(\min\left(A^{3}, N^{3}\right)\right). \tag{16}$$

The complexities of a DT and of an ensemble in terms of prediction time are shown by Equations (17) and (18), respectively:

$$O\left(N \log(N)\right), \tag{17}$$

$$O\left(N_{\mathrm{tree}} \times N \log(N)\right). \tag{18}$$

$S_{\max}$ controls the depth of the trees; therefore, the complexity for an ensemble is given by the following equation:

$$O\left(N_{\mathrm{tree}} \times N \times S_{\max}\right). \tag{19}$$

Only a single RFE out of the $K$ ensembles gets invoked for predicting a room.

The time complexity for GMM is given by the following equation:

$$O\left(K \times D^{3}\right). \tag{20}$$

Finally, the overall complexity of CEnsLoc in terms of prediction is as follows:

$$O\left(\min\left(A^{3}, N^{3}\right)\right) + O\left(K \times D^{3}\right) + O\left(N_{\mathrm{tree}} \times N \times S_{\max}\right). \tag{21}$$

5. Experimental Results and Discussion

This section describes the hardware, the software used, and the particulars of the experiments that were used to evaluate the performance of CEnsLoc in light of accuracy, precision, recall, training time, and response time.

An Intel machine (64-bit Xeon X5650) with a master clock at 2.67 GHz, 24 GB RAM, and 64-bit Windows 10 Education was used for experimentation in MATLAB. The real dataset was developed through FP collection at the ground floor of the Software Engineering (SE) Centre, University of Engineering and Technology (UET), Lahore, Pakistan. The building's dimensions are 39 m × 31 m (1209 m²), containing offices, class rooms, laboratories, and open corridors. Figures 3 and 4 depict the building's floor plan, room labels (L1–L10: closed rooms, L12: open corridor, and L11: a semiopen room), a total of 180 RPs, and the number of samples collected per room. The RPs are represented by small coloured dots in the rooms in Figure 3, and Figure 4 lists the total number of samples collected at each location L1–L12. This notion of rooms was used because walls play a crucial role in the fluctuation of Wi-Fi RSSI values [31, 32]. The area was planned into a grid of cells of 1.5 × 1.5 m². Each cell center was marked as the Reference Point (RP) for FP collection.

Input: training dataset with total A predictors
       Missing value replacement MVr
       Maximum number of clusters Kmax
Output: predicted location Lx
For training:
    Replace empty values with MVr
    Apply PCA on dataset to generate A′ predictors
    For k = 1 → Kmax
        Generate clusters
        Generate and save k data subsets
        For each p ∈ k data subsets
            Train p RFE using Algorithm 2 (training)
            Calculate performance measures
        End for
    End for
    Choose optimal configuration
    Save respective models for GMM, all RFEs
For prediction at a new point x:
    Replace missing values with MVr
    Apply PCA on the FP
    Match one cluster Cmatch
    Invoke RFE of Cmatch using Algorithm 2 (prediction)

ALGORITHM 1: CEnsLoc training and prediction algorithm.
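The prediction branch of Algorithm 1 can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the MATLAB implementation used in the experiments: the placeholder value `MV_R = -100.0`, the dictionary layout of `toy_model`, and all function names are hypothetical, the cluster is chosen by the highest weighted diagonal-covariance Gaussian density, and the trained Random Forest ensembles are replaced by simple callables.

```python
import math

MV_R = -100.0  # assumed placeholder (dBm) for access points that were not heard


def replace_missing(fp, mv_r=MV_R):
    """Replace missing RSSI readings (None) with the placeholder value."""
    return [mv_r if v is None else float(v) for v in fp]


def pca_project(fp, mean, components):
    """Project a fingerprint onto precomputed principal components."""
    centered = [v - m for v, m in zip(fp, mean)]
    return [sum(c * v for c, v in zip(comp, centered)) for comp in components]


def diag_gaussian_logpdf(z, mu, var):
    """Log density of a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * s) + (x - m) ** 2 / s)
        for x, m, s in zip(z, mu, var)
    )


def match_cluster(z, clusters):
    """Return the index of the GMM component with the highest weighted density."""
    scores = [
        math.log(c["weight"]) + diag_gaussian_logpdf(z, c["mean"], c["var"])
        for c in clusters
    ]
    return max(range(len(clusters)), key=scores.__getitem__)


def predict_room(fp, model):
    """Prediction branch of Algorithm 1 for one fingerprint."""
    z = pca_project(replace_missing(fp), model["pca_mean"], model["pca_components"])
    c_match = match_cluster(z, model["clusters"])
    return model["ensembles"][c_match](z)  # only the matched ensemble is invoked


# Toy model with 3 APs, 2 principal components, and 2 clusters (all values invented).
toy_model = {
    "pca_mean": [-70.0, -80.0, -90.0],
    "pca_components": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "clusters": [
        {"weight": 0.5, "mean": [10.0, 0.0], "var": [25.0, 25.0]},
        {"weight": 0.5, "mean": [-10.0, 0.0], "var": [25.0, 25.0]},
    ],
    # Stand-ins for trained Random Forest ensembles: any callable returning a label.
    "ensembles": [
        lambda z: "L1" if z[1] > 0 else "L2",
        lambda z: "L11" if z[1] > 0 else "L12",
    ],
}

room = predict_room([-58.0, -75.0, None], toy_model)
print(room)
```

Because each fingerprint is routed to exactly one ensemble, prediction cost in this sketch does not grow with the number of clusters beyond the cluster-matching step.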

The complete dataset consisted of 20087 Wi-Fi RSSI FPs in which a total of 40 APs were detected. Figure 4 depicts the number of FPs captured in each room/location, marked as L1–L12. It must be noted that all these APs belong to the university infrastructure, comprising the SE centre and its immediate neighboring buildings. The FPs were preprocessed following the same process described in Sections 3.1 and 3.2. PCA-based dimension reduction was employed, with optimal results found with 23 PCs. The resultant dataset was divided with a 70:30 stratified ratio into training and testing subsets. The experimental results are discussed using both 10-fold cross validation (10-CV) on the training subset and the unseen 30% test subset. GMM clustering then partitioned the training data into further subsets, and the optimal configuration was two clusters, shared covariance kept as true, and diagonal covariance. Furthermore, one RFE was trained for each subset with 132 trees, 1024 maximum splits, and 8 random features. CEnsLoc was hosted at a machine, and the mean of observed RSSI FPs was used for run time location estimation after applying the same model of PCA and GMM cluster matching finalized from the training phase. The best matched cluster's respective RFE was invoked for room prediction. Various companion apps can query the location of subscribed users using the IPS configured as a server. Response time was computed as an average of the difference between localization query time and prediction generation time. The following formulae were used to compute the performance parameters:

$$\begin{aligned}
\text{accuracy} &= \frac{TP + TN}{TP + TN + FP + FN},\\
\text{precision} &= \frac{TP}{TP + FP},\\
\text{recall} &= \frac{TP}{TP + FN}.
\end{aligned} \tag{22}$$
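As a small worked example of Equation (22), the sketch below derives the four counts for one room from true and predicted labels and then applies the three formulae. The one-vs-rest counting for a single room is an assumption made for illustration (the text does not state how the multi-class counts were aggregated), and the label lists are invented.

```python
def one_vs_rest_counts(y_true, y_pred, room):
    """Count TP, TN, FP, FN for one room treated as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == room and p == room)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != room and p != room)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != room and p == room)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == room and p != room)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    # Defined as 0.0 here when the room was never predicted.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    # Defined as 0.0 here when the room never occurs in the true labels.
    return tp / (tp + fn) if (tp + fn) else 0.0


# Invented labels for six fingerprints.
y_true = ["L1", "L1", "L2", "L2", "L12", "L12"]
y_pred = ["L1", "L2", "L2", "L2", "L12", "L1"]

tp, tn, fp, fn = one_vs_rest_counts(y_true, y_pred, "L1")
print(accuracy(tp, tn, fp, fn), precision(tp, fp), recall(tp, fn))
```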

5.1. Classification Effectiveness and Efficiency. Tables 2 and 3 summarize the 10-CV performance evaluation of CEnsLoc, comparing it with k-NN [33], artificial neural network (ANN) [34], K* [35], FURIA [36], and DeepLearning4J [36]. The best results are highlighted throughout with boldface. k-NN results were averaged over six different configurations based on the number of neighbors, the similarity measure employed, and neighbor weightage. Similarly, six different configurations of ANN, namely, 2-, 3-, and 4-layer ANNs, each having a varying number of neurons per layer and employing two learning algorithms, SCG and RBP, were averaged to obtain the results. For K*, the entropic blend percentage was varied from 10 to 90 percent. The results for FURIA were obtained by varying the count of folds for growth and pruning, as well as the minimum instance weight for each split. DeepLearning4J results were obtained by varying the number of neurons per layer, the number of dense layers in the network, and the training algorithm, out of many possible variations in the hyperparameters. The results for CEnsLoc were generated by the optimal configuration found for our proposed approach. For all the approaches, the same set of configurations as used during 10-CV performance evaluation was used for result generation on the 30% unseen stratified test data subset, whose results are presented in Table 4. The best performance on both the 10-CV results (Table 2) and the test dataset (Table 4) was obtained by CEnsLoc with 97% and 95% accuracy, followed by FURIA (92%, 90%), k-NN (91%, 89%), K* (90%, 87%), ANN (85%, 82%), and DeepLearning4J (73%, 71%), respectively. The results are presented on both 10-CV and the unseen test data subset to show that, for a small dataset, 10-CV results can provide a good performance estimate, as results on the test data subset showed trends similar to those of the 10-CV results. Although deep learning and ANN provide good performance results, finding the optimal configuration for a huge number of tunable parameters is a tricky task. Moreover, during experiments, it was found that, despite generic guidelines for parameter tuning, the performance measures and convergence rate of ANN as well as deep learning schemes fluctuate highly with even a slight variation in the number of neurons in layers, the training algorithm, and the number of hidden layers, which makes it harder to find the optimal configuration for ANN and deep learning models. Lazy and instance-based approaches such as k-NN and K* are also good candidates for indoor localization; however, they have a very limited number of tunable parameters, resulting in an inability to surpass CEnsLoc performance despite trying different parameter combinations. They do not generalize as well as RFE, as the end result of prediction is heavily dependent on the majority vote by the k closest matched samples in the dataset. They also need template matching with the entire dataset for one location prediction, which results in growing response time with an increasing number of samples in the dataset. Such growth is highly likely in practical real-world scenarios, as a typical building has quite a large number of visible APs, and a sufficiently large number of samples is also required for classifiers to work properly.

Input: data subset with total A′ predictors for training
       No. of trees Ntree
       Allowed maximum no. of splits Smax
       Random no. of predictors/features f
Output: estimated location Lx
For system training:
    Step 1: for l = 1 to Ntree
        (i) Choose a bootstrap sample set (SS) of size Nsample with replacement from the training data subset
        (ii) Grow a Random Forest tree (Tl) on SS by recursively iterating (a)–(c) for every terminal node of the tree until the maximum no. of splits (Smax) is reached
            (a) Randomly select f features/variables from the A′ predictors (f ≪ A′)
            (b) Choose the best feature/split-point from the f, employing the Gini index
            (c) Split the node into two children nodes
    Step 2: Produce the resulting ensemble of trees {Tl}, l = 1, …, Ntree
For location prediction at a new point x from the RFE Lrf:
    Let Lm(x) be the room/class prediction by the mth RFE tree
    Lrf(x) = majority vote {Lm(x)}, m = 1, …, Ntree

ALGORITHM 2: RFE classification algorithm for training and prediction.
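Two steps of Algorithm 2 lend themselves to a short illustration: choosing a split point with the Gini index (step (b)) and combining the per-tree predictions by majority vote. The sketch below is a simplified illustration, not the authors' implementation: it searches thresholds on a single feature rather than over f randomly selected features, the function names are hypothetical, and the RSSI values and room labels are invented.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Best threshold on one feature by weighted Gini impurity of the children."""
    best_threshold, best_score = None, float("inf")
    # The largest value is skipped because it would leave the right child empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


def majority_vote(tree_predictions):
    """Room label predicted by most trees (ties go to the label seen first)."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Invented RSSI readings (dBm) from one AP and the rooms they were recorded in.
rssi = [-80, -75, -60, -55]
rooms = ["L1", "L1", "L2", "L2"]

threshold, impurity = best_split(rssi, rooms)
vote = majority_vote(["L3", "L1", "L3"])
print(threshold, impurity, vote)
```

In a full tree, the same search would be repeated at every node over the f sampled features until the split limit Smax is reached.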

A minimum response time of 2.05E−05 seconds was obtained by FURIA. DeepLearning4J stood second with a response time of 6.82E−05 seconds. ANN, k-NN, and CEnsLoc all had response times on the scale of E−04 seconds, which is about 10 times slower than the aforementioned two approaches, but the difference was trifling and cannot be detected by any human user of the system; the accuracy, precision, and recall provided by CEnsLoc were much greater than those of all other approaches.

Figure 3: Room labels (L1–L12) and marked RPs.

Figure 4: Number of samples per room (x-axis: locations L1–L12; y-axis: number of samples, 0–5000). Bar labels in extracted order: 446, 446, 717, 2700, 2700, 2700, 2700, 1446, 250, 743, 743, 4496.

FURIA, an upgraded version of the RIPPER algorithm, stands out as the second best performer for indoor localization, indicating that rule-based algorithms, especially fuzzified versions with good generalization capabilities, perform well for location estimation as well. However, it lagged behind CEnsLoc in terms of accuracy, precision, and recall by 5%, 10%, and 15%, respectively, on the test subset (Table 4).

The details of both 10-CV and test dataset results for all IPS compared and CEnsLoc are depicted in Figure 5 for side-by-side visual comparison.

5.2. Out-of-Bag (OOB) Error Results. There is a performance measure called OOB error, which is peculiar to RFE. During training of an RFE, data subsets are generated with replacement, resulting in some repeated and some left-out observations (OOB observations) for each tree. That particular tree does not train on these left-out OOB observations. The prediction capability of an RFE can be measured using the “OOB loss,” which is the error made on unseen OOB observations during training. The OOB loss measure has been investigated and shown to provide an upper bound on testing error [37], which is specifically useful for small-sized datasets. Hence, OOB error can be used just like, or instead of, an unseen test data subset if the available dataset is small or an unseen test dataset is unavailable. It provides a very good estimate of the trained classifier's generalization capability. Table 5 summarizes the OOB loss compared with the averaged 10-CV loss, indicating that it indeed bounds it.
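The way OOB observations arise from sampling with replacement, and how an OOB loss can be computed from them, is sketched below. This is a generic illustration of the idea and not the MATLAB routine used for Table 5: the function names, the fixed seed, and the tiny label and vote lists are all hypothetical.

```python
import random
from collections import Counter


def bootstrap_indices(n, rng):
    """Draw n indices with replacement, as done for each tree's training set."""
    return [rng.randrange(n) for _ in range(n)]


def oob_indices(n, in_bag):
    """Indices never drawn for a tree: its out-of-bag observations."""
    drawn = set(in_bag)
    return [i for i in range(n) if i not in drawn]


def oob_loss(labels, votes):
    """Fraction of observations whose OOB majority vote is wrong.

    votes[i] holds the predictions made for observation i by the trees that
    did not train on it; observations without any OOB vote are skipped.
    """
    scored = [(y, v) for y, v in zip(labels, votes) if v]
    wrong = sum(1 for y, v in scored if Counter(v).most_common(1)[0][0] != y)
    return wrong / len(scored)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 10000
in_bag = bootstrap_indices(n, rng)
left_out = oob_indices(n, in_bag)
oob_fraction = len(left_out) / n  # expected near (1 - 1/n)**n, about 0.368

# Invented example: three observations and the votes of the trees that left them out.
loss = oob_loss(["L1", "L2", "L3"], [["L1", "L1"], ["L1"], []])
print(round(oob_fraction, 3), loss)
```

Roughly a third of the observations are left out of each tree's sample, which is why the OOB loss can act as a built-in validation estimate without setting data aside.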


[36] J Huhn and E Hullermeier ldquoFURIA an algorithm for un-ordered fuzzy rule inductionrdquo Data Mining and KnowledgeDiscovery vol 19 no 3 pp 293ndash319 2009

[37] S Ciss Generalization Error and Out-of-bag Bounds inRandom (Uniform) Forests PhD thesis French UniversityParis France 2015


number of samples collected per room. RPs are represented by small coloured dots in the rooms in Figure 3, and Figure 4 lists the total number of samples collected at each location L1–L12. This notion of rooms was used because walls play a crucial role in the fluctuation of Wi-Fi RSSI values [31, 32]. The area was planned into a grid of cells of 1.5 × 1.5 m². Each cell center was marked as the Reference Point (RP) for FP collection.

The complete dataset consisted of 20,087 Wi-Fi RSSI FPs, in which a total of 40 APs were detected. Figure 4 depicts the number of FPs captured in each room/location, marked as L1–L12. It must be noted that all these APs belong to university infrastructure comprising the SE centre and its immediately neighboring buildings. The FPs were preprocessed following the same process described in Sections 3.1 and 3.2. PCA-based dimension reduction was employed, with optimal results found with 23 PCs. The resultant dataset was divided with a 70:30 stratified ratio into training and testing subsets. The experimental results are discussed using both 10-fold cross validation (10-CV) on the training subset and the unseen 30% test subset. GMM clustering then partitioned the training data into further subsets; the optimal configuration was two clusters, with shared covariance set to true and a diagonal covariance structure. Furthermore, one RFE was trained for each subset with 132 trees, 1024 maximum splits, and 8 random features. CEnsLoc was hosted on a machine, and the mean of the observed RSSI FPs was used for run-time location estimation after applying the same PCA model and GMM cluster matching finalized in the training phase. The RFE belonging to the best-matched cluster was invoked for room prediction. Various companion apps can query the location of subscribed users using the IPS configured as a server. Response time was computed as the average of the difference between localization query time and prediction generation time. The following formulae were used to compute the performance parameters:

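$$
\begin{aligned}
\text{accuracy} &= \frac{TP + TN}{TP + TN + FP + FN},\\
\text{precision} &= \frac{TP}{TP + FP},\\
\text{recall} &= \frac{TP}{TP + FN}.
\end{aligned}
\tag{22}
$$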
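As a worked illustration of equation (22), the following sketch computes accuracy, precision, and recall from true and predicted room labels. The text does not state how the per-room values are aggregated over the 12 locations, so the one-vs-rest counting with macro-averaging used here is an assumption, and the function name `room_metrics` and the toy labels are hypothetical.

```python
from collections import Counter


def room_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall), macro-averaged over rooms.

    Each room is scored one-vs-rest with the TP/TN/FP/FN counts of Eq. (22).
    Macro-averaging is an assumption; the source does not name the scheme.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and equally long")
    n = len(y_true)
    rooms = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # room p was predicted, but the sample is not in p
            fn[t] += 1  # true room t was missed
    acc = prec = rec = 0.0
    for r in rooms:
        tn = n - tp[r] - fp[r] - fn[r]
        acc += (tp[r] + tn) / n
        prec += tp[r] / (tp[r] + fp[r]) if tp[r] + fp[r] else 0.0
        rec += tp[r] / (tp[r] + fn[r]) if tp[r] + fn[r] else 0.0
    k = len(rooms)
    return acc / k, prec / k, rec / k


# Toy labels (made up for illustration, not taken from the dataset).
truth = ["L1", "L1", "L2", "L2", "L3", "L3"]
guess = ["L1", "L2", "L2", "L2", "L3", "L1"]
accuracy, precision, recall = room_metrics(truth, guess)
print(round(accuracy, 3), round(precision, 3), round(recall, 3))
```

Note that the one-vs-rest accuracy averaged over rooms is generally higher than the plain fraction of correctly predicted samples, because every room also collects true negatives from the other rooms.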

5.1. Classification Effectiveness and Efficiency. Tables 2 and 3 summarize the 10-CV performance evaluation of CEnsLoc, comparing it with k-NN [33], artificial neural network (ANN) [34], K* [35], FURIA [36], and DeepLearning4J [36]. The best results are highlighted throughout in boldface. The k-NN results were averaged over six different configurations based on the number of neighbors, the similarity measure employed, and neighbor weightage. Similarly, six different configurations of ANN, namely, 2-, 3-, and 4-layer ANNs, each having a varying number of neurons per layer and employing two learning algorithms, SCG and RBP, were averaged to obtain the results. For K*, the entropic blend percentage was varied from 10 to 90 percent. The results for FURIA were obtained by varying the count of folds for growth and pruning as well as the minimum instance weight for each split. DeepLearning4J results were obtained by varying the number of neurons per layer, the number of dense layers in the network, and the training algorithm, out of many possible variations in the hyperparameters. The results for CEnsLoc were generated with the optimal configuration found for our proposed approach. For all the approaches, the same set of configurations used during the 10-CV performance evaluation was used for result generation on the 30% unseen stratified test data subset, whose results are presented in Table 4. The best performance on both the 10-CV results (Table 2) and the test dataset (Table 4) was obtained by CEnsLoc with 97% and 95% accuracy, followed by FURIA (92%, 90%), k-NN (91%, 89%), K* (90%, 87%), ANN (85%, 82%), and DeepLearning4J (73%, 71%), respectively. The results are presented on both 10-CV and the unseen test data subset to show that, for a small dataset, 10-CV results can provide a good performance estimate, as the results on the test data subset showed trends similar to those of the 10-CV results. Although deep learning and ANN provide good performance, finding the optimal configuration for a huge number of tunable parameters is a tricky task. Moreover, during experiments, it was found that, despite generic guidelines for parameter tuning, the performance measures and convergence rate of ANN as well as deep learning schemes fluctuate highly with even a slight variation in the number of neurons in layers, the training algorithm, and the number of hidden layers, which makes it harder to find the optimal configuration for ANN and deep learning models. Lazy and instance-based approaches such as k-NN and K* are also good candidates for indoor localization; however, they have a very limited number of tunable parameters, resulting in an inability to surpass CEnsLoc performance despite trying different parameter combinations. They do not generalize as well as RFE, as the end result of prediction is heavily dependent on the majority vote by the k closest matched samples in the dataset. They also need template matching with the entire dataset for one location prediction, which results in growing response time with an increasing number of samples in the dataset. Such growth is highly likely in practical real-world scenarios, as a typical building has quite a large number of visible APs, and a sufficiently large number of samples is also required for classifiers to work properly.

Input: data subset with total A′ predictors for training; number of trees, Ntree; allowed maximum number of splits, Smax; random number of predictors/features, f.
Output: estimated location Lx.

For system training:
Step 1: for l = 1 to Ntree:
  (i) Choose a bootstrap sample set (SS) of size Nsample with replacement from the training data subset.
  (ii) Generate a Random Forest Tree (Tl) on SS by recursively iterating (a)–(c) for every terminal node of the tree until the maximum number of splits (Smax) is reached:
    (a) Randomly select f features/variables from the A′ predictors (f << A′).
    (b) Choose the best feature/split-point from the f, employing the Gini index.
    (c) Split the node into two children nodes.
Step 2: Produce the resulting ensemble of trees {Tl}, l = 1, ..., Ntree.

For location prediction at a new point x from the RFE:
Let Lm(x) be the room/class prediction by the mth RFE tree. Then Lrf(x) = majority vote {Lm(x)}, m = 1, ..., Ntree.

Algorithm 2: RFE classification algorithm for training and prediction.
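Steps (a)–(c) and the prediction rule of Algorithm 2 can be sketched in a few lines of plain Python. This is a hedged illustration rather than the authors' implementation: the helper names (`gini`, `best_split`, `majority_vote`), the exhaustive threshold search, and the toy feature values are all assumptions introduced here, and tree growth up to Smax splits is omitted for brevity.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, f, rng):
    """Steps (a)-(b): draw f random features, return the best (feature, threshold).

    The score is the size-weighted Gini impurity of the two children.
    Returns None when no candidate threshold separates the rows.
    """
    n_features = len(rows[0])
    candidates = rng.sample(range(n_features), f)  # (a) random feature subset
    best, best_score = None, float("inf")
    for j in candidates:
        for threshold in sorted({row[j] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            if not left or not right:
                continue  # (c) needs two non-empty children
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:  # (b) lowest weighted impurity wins
                best, best_score = (j, threshold), score
    return best


def majority_vote(tree_predictions):
    """Prediction rule: the room named by most trees wins."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Toy samples with two feature values each (made-up numbers, hypothetical rooms).
rows = [(-70.0, 1.0), (-68.0, 4.0), (-45.0, 2.0), (-43.0, 3.0)]
labels = ["L1", "L1", "L2", "L2"]
split = best_split(rows, labels, f=2, rng=random.Random(0))
vote = majority_vote(["L2", "L1", "L2"])
print(split, vote)
```

In this toy case the first feature separates the two rooms perfectly, so the selected split has zero weighted impurity. A full tree builder would apply `best_split` recursively to each child until the split budget is exhausted.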

Figure 3: Room labels and marked RPs. [Floor plan with rooms labelled L1–L12 and RPs marked as dots.]

Figure 4: Number of samples per room. [Bar chart; x-axis: locations L1–L12; y-axis: number of samples, 0 to 5000. Bar values in extraction order: 446, 446, 717, 2700, 2700, 2700, 2700, 1446, 250, 743, 743, 4496.]

A minimum response time of 2.05E-05 seconds was obtained by FURIA. DeepLearning4J stood second with a response time of 6.82E-05 seconds. ANN, k-NN, and CEnsLoc all had response times on the scale of E-04 seconds, which is 10 times slower than the aforementioned two approaches, but the difference is trifling and cannot be detected by any human user of the system. The accuracy, precision, and recall provided by CEnsLoc were much greater than those of all other approaches.
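Response time as defined in this section (the average gap between a localization query and the generated prediction) can be measured with a small timing harness. The sketch below is only an illustration: `mean_response_time` and `stub_predict` are hypothetical names, and the stub stands in for whichever trained model is being timed.

```python
import time


def mean_response_time(predict, queries):
    """Average seconds between issuing a query and receiving its prediction."""
    if not queries:
        raise ValueError("need at least one query")
    total = 0.0
    for fingerprint in queries:
        start = time.perf_counter()           # localization query time
        predict(fingerprint)
        total += time.perf_counter() - start  # minus prediction generation time
    return total / len(queries)


def stub_predict(fingerprint):
    """Hypothetical stand-in model: thresholds the strongest RSSI value."""
    return "L1" if max(fingerprint) > -60 else "L2"


# Made-up RSSI fingerprints, repeated to smooth out timer noise.
queries = [(-55, -80, -90), (-75, -82, -88)] * 500
avg = mean_response_time(stub_predict, queries)
print(f"{avg:.2e} s per query over {len(queries)} queries")
```

Averaging over many queries matters here because single predictions at the 1E-04 or 1E-05 second scale are close to the resolution of most timers.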

FURIA, an upgraded version of the RIPPER algorithm, stands out as the second-best performer for indoor localization, indicating that rule-based algorithms, especially fuzzified versions with good generalization capabilities, perform well for location estimation as well. However, it lagged behind CEnsLoc in terms of accuracy, precision, and recall by 5%, 10%, and 15%, respectively.

The details of both the 10-CV and test dataset results for all compared IPSs and CEnsLoc are depicted in Figure 5 for side-by-side visual comparison.

5.2. Out-of-Bag (OOB) Error Results. There is a performance measure called OOB error, which is peculiar to RFE. During training of RFE, data subsets are generated with replacement, resulting in some repeated and some left-out observations (OOB observations) for each tree. That particular tree does not train on these left-out OOB observations. The prediction capability of RFE can be measured using "OOB loss," which is the error made on unseen OOB observations during training. The OOB loss measure has been investigated and shown to provide an upper bound on testing error [37], which is specifically useful for small-sized datasets. Hence, OOB error can be used just like, or instead of, an unseen test data subset if the available dataset is small or an unseen test dataset is unavailable. It provides a very good estimate of the trained classifier's generalization capability. Table 5 summarizes the OOB loss compared with the averaged 10-CV loss, indicating that it indeed bounds it.
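To make the notion of OOB observations concrete, the sketch below draws bootstrap samples with replacement and records which observations each tree never sees, then scores a few OOB votes. It is an illustration in plain Python and not the authors' code: the function names, the 20 trees, the sample size, and the toy votes are assumptions. For large samples the left-out share per tree approaches $1/e \approx 0.368$, since each observation is missed by one draw with probability $1 - 1/n$.

```python
import random


def bootstrap_with_oob(n_samples, rng):
    """Draw n_samples indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag


def oob_error(y_true, oob_votes):
    """Error rate over observations that received at least one OOB vote.

    oob_votes[i] holds the labels predicted for observation i by the trees
    that did not train on it; the majority label is compared with y_true[i].
    """
    scored = wrong = 0
    for truth, votes in zip(y_true, oob_votes):
        if not votes:
            continue  # observation was in every bag, so it cannot be scored
        scored += 1
        majority = max(set(votes), key=votes.count)
        wrong += majority != truth
    return wrong / scored if scored else float("nan")


rng = random.Random(42)
n = 1000
fractions = []
for _ in range(20):  # 20 hypothetical trees
    _, oob = bootstrap_with_oob(n, rng)
    fractions.append(len(oob) / n)
mean_oob_fraction = sum(fractions) / len(fractions)
print(round(mean_oob_fraction, 3))

# Toy votes: the third observation has no OOB vote and is skipped.
example_error = oob_error(["L1", "L2", "L3"], [["L1", "L1"], ["L1"], []])
print(example_error)
```

Because every observation is out of bag for roughly a third of the trees, the forest can be scored on all training observations without holding any of them out, which is why the measure is attractive for small datasets.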

Table 2: Tenfold cross-validated performance evaluation and comparison of CEnsLoc.

| Method | Accuracy | Precision | Recall |
| --- | --- | --- | --- |
| k-NN | 0.91 | 0.88 | 0.87 |
| ANN | 0.85 | 0.83 | 0.78 |
| K* | 0.90 | 0.96 | 0.62 |
| FURIA | 0.92 | 0.89 | 0.82 |
| DeepLearning4J | 0.73 | 0.51 | 0.41 |
| CEnsLoc | 0.97 | 0.98 | 0.97 |

Table 3: Tenfold cross-validated performance comparison of CEnsLoc in terms of training and response time (seconds). Decimal points in the training-time and dataset response-time columns were lost in extraction; those digits are shown as extracted.

| Method | Training time (10-fold) | Avg. training time (1-fold) | Response time (dataset) | Response time (1 sample) |
| --- | --- | --- | --- | --- |
| k-NN | - | - | 197 | 1.70E-04 |
| ANN | 26720 | 2672 | 018 | 1.41E-04 |
| K* | - | - | 10309 | 8.17E-02 |
| FURIA | 1587 | 158 | 003 | 2.05E-05 |
| DeepLearning4J | 75559 | 755 | 009 | 6.82E-05 |
| CEnsLoc | 14007 | 14007 | 235 | 2.08E-04 |

Table 4: Performance evaluation and comparison of CEnsLoc on the test subset.

| Method | Accuracy | Precision | Recall |
| --- | --- | --- | --- |
| k-NN | 0.89 | 0.87 | 0.85 |
| ANN | 0.82 | 0.81 | 0.76 |
| K* | 0.87 | 0.94 | 0.60 |
| FURIA | 0.90 | 0.86 | 0.79 |
| DeepLearning4J | 0.71 | 0.48 | 0.39 |
| CEnsLoc | 0.95 | 0.96 | 0.94 |

6. Conclusion and Future Work

Location prediction/estimation provides derivation of meaningful context for a broad range of services and applications. Indoor localization can open up a plethora of new opportunities because humans spend most of their time indoors. CEnsLoc offers shorter response time and an overall improvement in accuracy, precision, and recall. With only a few parameters to be tuned, it is suited for FP-based localization, which requires frequent recollection of data and retraining. CEnsLoc attained 97% accuracy, whereas the other IPSs, averaged over 6 different configurations, namely, FURIA, k-NN, K*, ANN, and DeepLearning4J, attained 92%, 91%, 90%, 85%, and 73% accuracy, respectively. It can be utilized for elderly assistance, navigation, smart buildings, and smart transportation, to name a few potential applications.

Our future work includes deployment of CEnsLoc across a wide range of civil infrastructures, including different floors and/or buildings, to understand its performance in more detail, as well as its scalability using crowdsourcing. We also aim to build safety, security, and evacuation guide applications with CEnsLoc at their core for users in offices, universities, and retail. Furthermore, integration with GPS, Bluetooth, and PDR to further enhance accuracy, utilizing the hybrid technologies available at a given time, is also part of the plan.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07048697).

Figure 5: Performance measures on both 10-CV and test dataset. [Grouped bar chart of precision, recall, and accuracy (y-axis 0 to 1.2) for k-NN, 4-layer ANN (SCG), 4-layer ANN (RBP), 3-layer ANN (SCG), 3-layer ANN (RBP), 2-layer ANN (SCG), 2-layer ANN (RBP), K*, FURIA, DeepLearning4J, and CEnsLoc; one panel for 10-CV and one for the test dataset.]

Table 5: OOB loss.

| Average 10-fold loss | OOB loss |
| --- | --- |
| 0.0292813 | 0.0300123 |

References

[1] J. Torres-Sospedra, R. Montoliu, S. Trilles, O. Belmonte, and J. Huerta, "Comprehensive analysis of distance and similarity measures for Wi-Fi fingerprinting indoor positioning systems," Expert Systems with Applications, vol. 42, no. 23, pp. 9263–9278, 2015.

[2] H. Mehmood and N. K. Tripathi, "Cascading artificial neural networks optimized by genetic algorithms and integrated with global navigation satellite system to offer accurate ubiquitous positioning in urban environment," Computers, Environment and Urban Systems, vol. 37, no. 1, pp. 35–44, 2013.

[3] M. M. Soltani, A. Motamedi, and A. Hammad, "Enhancing cluster-based RFID tag localization using artificial neural networks and virtual reference tags," Automation in Construction, vol. 54, pp. 93–105, 2015.

[4] M. L. Rodrigues, L. F. M. Vieira, and M. F. M. Campos, "Fingerprinting-based radio localization in indoor environments using multiple wireless technologies," in Proceedings of IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, pp. 1203–1207, Toronto, ON, Canada, September 2011.

[5] J. Luo and H. Gao, "Deep belief networks for fingerprinting indoor localization using ultrawideband technology," International Journal of Distributed Sensor Networks, vol. 12, no. 1, article 5840916, 2016.

[6] S. Saeedi, L. Paull, M. Trentini, and H. Li, "Neural network-based multiple robot simultaneous localization and mapping," IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 880–885, 2011.

[7] M. Azizyan, R. R. Choudhury, and I. Constandache, "SurroundSense: mobile phone localization via ambience fingerprinting," in Proceedings of the 15th Annual International Conference on Mobile Computing and Networking-MobiCom'09, pp. 261–272, Beijing, China, 2009.

[8] L. Pei, R. Chen, J. Liu et al., "Motion Recognition Assisted Indoor Wireless Navigation on a Mobile Phone," in Proceedings of the 23rd International Technical Meeting of The Satellite Division of the Institute of Navigation, pp. 3366–3375, Portland, OR, USA, September 2010.

[9] P. S. Nagpal and R. Rashidzadeh, "Indoor positioning using magnetic compass and accelerometer of smartphones," in Proceedings of International Conference on Selected Topics in Mobile and Wireless Networking (MoWNeT), pp. 140–145, Montreal, Canada, 2013.

[10] J. Menke and A. Zakhor, "Multi-modal indoor positioning of mobile devices," in Proceedings of IEEE International Conference on Indoor Positioning and Indoor Navigation, pp. 13–16, Alberta, Canada, October 2015.

[11] M. Oussalah, M. Alakhras, and M. I. Hussein, "Multivariable fuzzy inference system for fingerprinting indoor localization," Fuzzy Sets and Systems, vol. 269, pp. 65–89, 2015.

[12] N. Li, J. Chen, Y. Yuan, X. Tian, Y. Han, and M. Xia, "A wi-fi indoor localization strategy using particle swarm optimization based artificial neural networks," International Journal of Distributed Sensor Networks, vol. 12, no. 3, article 4583147, 2016.

[13] C. Song, J. Wang, and G. Yuan, "Hidden naive bayes indoor fingerprinting localization based on best-discriminating AP selection," ISPRS International Journal of Geo-Information, vol. 5, no. 10, p. 189, 2016.

[14] P. Bahl and V. N. Padmanabhan, "RADAR: an in-building RF-based user location and tracking system," in Proceedings of IEEE INFOCOM 2000, Conference on Computer Communications, Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies (Cat. No. 00CH37064), vol. 2, pp. 775–784, Tel Aviv, Israel, March 2000.

[15] M. Cooper, J. Biehl, G. Filby, and S. Kratz, "LoCo: boosting for indoor location classification combining Wi-Fi and BLE," Personal and Ubiquitous Computing, vol. 20, no. 1, pp. 1–14, 2016.

[16] C. Wang, F. Wu, Z. Shi, and D. Zhang, "Indoor positioning technique by combining RFID and particle swarm optimization-based back propagation neural network," Optik (Stuttg), vol. 127, no. 17, pp. 6839–6849, 2016.

[17] J. Xu, H. Dai, and W. Ying, "Multi-layer neural network for received signal strength-based indoor localisation," IET Communications, vol. 10, no. 6, pp. 717–723, 2016.

[18] W. Zhang, K. Liu, W. Zhang, Y. Zhang, and J. Gu, "Deep neural networks for wireless localization in indoor and outdoor environments," Neurocomputing, vol. 194, pp. 279–287, 2016.

[19] L. Calderoni, M. Ferrara, A. Franco, and D. Maio, "Indoor localization in a hospital environment using Random Forest classifiers," Expert Systems with Applications, vol. 42, no. 1, pp. 125–134, 2015.

[20] E. Jedari, Z. Wu, R. Rashidzadeh, and M. Saif, "Wi-Fi based indoor location positioning employing random forest classifier," in Proceedings of 2015 International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 13–16, Banff, Alberta, Canada, October 2015.

[21] Y. Mo, Z. Zhang, Y. Lu, W. Meng, and G. Agha, "Random forest based coarse locating and KPCA feature extraction for indoor positioning system," Mathematical Problems in Engineering, vol. 2014, Article ID 850926, 8 pages, 2014.

[22] R. Gorak and M. Luckner, "Malfunction immune Wi-Fi localisation method," in Computational Collective Intelligence, pp. 328–337, Springer, Cham, Switzerland, 2015.

[23] R. Gorak and M. Luckner, "Modified random forest algorithm for Wi-Fi indoor localization system," in Proceedings of International Conference on Computational Collective Intelligence, pp. 147–157, Cham, Switzerland, September 2016.

[24] J. Niu, B. Wang, L. Cheng, and J. J. P. C. Rodrigues, "WicLoc: an indoor localization system based on WiFi fingerprints and crowdsourcing," in Proceedings of 2015 IEEE International Conference on Communications (ICC), pp. 3008–3013, London, UK, June 2015.

[25] G. Ding, Z. Tan, J. Zhang, and L. Zhang, "Fingerprinting localization based on affinity propagation clustering and artificial neural networks," in Proceedings of IEEE Wireless Communications and Networking Conference (WCNC), pp. 2317–2322, Shanghai, China, April 2013.

[26] O. Leon, J. Hernandez-Serrano, and M. Soriano, "Securing cognitive radio networks," International Journal of Communication Systems, vol. 23, no. 5, pp. 633–652, 2010.

[27] S. Tuncer and T. Tuncer, "Indoor localization with bluetooth technology using artificial neural networks," in Proceedings of the IEEE 19th International Conference on Intelligent Engineering Systems, pp. 213–217, Bratislava, Slovakia, September 2015.

[28] B. A. Akram, A. H. Akbar, B. Wajid, O. Shafiq, and A. Zafar, "LocSwayamwar: finding a suitable ML algorithm for wi-fi fingerprinting based indoor positioning system," in Lecture Notes in Electrical Engineering, A. Boyaci, A. Ekti, M. Aydin, and S. Yarkan, Eds., vol. 504, Springer, Singapore, 2018.

[29] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.

[30] K. Kaji and N. Kawaguchi, "Design and implementation of wifi indoor localization based on Gaussian mixture model and particle filter," in Proceedings of 2012 International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 1–9, Sydney, Australia, November 2012.

[31] C. Wu, Z. Yang, Y. Liu, and W. Xi, "WILL: wireless indoor localization without site survey," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 4, pp. 839–848, 2013.

[32] S. Yang, P. Dessai, M. Verma, and M. Gerla, "FreeLoc: calibration-free crowdsourced indoor localization," in Proceedings of IEEE INFOCOM, pp. 2481–2489, Turin, Italy, April 2013.

[33] T. Seidl and H.-P. Kriegel, "Optimal multi-step k-nearest neighbor search," ACM SIGMOD Record, vol. 27, no. 2, pp. 154–165, 1998.

[34] W. S. McCulloch and W. Pitts, "A logical calculus nervous activity," Bulletin of Mathematical Biology, vol. 52, no. 1-2, pp. 99–115, 1990.

[35] J. G. Cleary and L. E. Trigg, "K*: an instance-based learner using an entropic distance measure," in Proceedings of Twelfth International Conference on Machine Learning, vol. 5, pp. 1–14, Tahoe City, CA, USA, July 1995.

[36] J. Huhn and E. Hullermeier, "FURIA: an algorithm for unordered fuzzy rule induction," Data Mining and Knowledge Discovery, vol. 19, no. 3, pp. 293–319, 2009.

[37] S. Ciss, Generalization Error and Out-of-bag Bounds in Random (Uniform) Forests, Ph.D. thesis, French University, Paris, France, 2015.



[8] L Pei R Chen J Liu et al ldquoMotion Recognition AssistedIndoor Wireless Navigation on a Mobile Phonerdquo in Pro-ceedings of the 23rd International Technical Meeting of eSatellite Division of the Institute of Navigation pp 3366ndash3375Portland OR USA September 2010

[9] P S Nagpal and R Rashidzadeh ldquoIndoor positioning usingmagnetic compass and accelerometer of smartphonesrdquo inProceedings of International Conference on Selected Topics inMobile and Wireless Networking (MoWNeT) pp 140ndash145Montreal Canada 2013

[10] J Menke and A Zakhor ldquoMulti-modal indoor positioning ofmobile devicesrdquo in Proceedings of IEEE International Con-ference on Indoor Positioning and Indoor Navigationpp 13ndash16 Alberta Canada October 2015

[11] M Oussalah M Alakhras and M I Hussein ldquoMultivariablefuzzy inference system for ngerprinting indoor localizationrdquoFuzzy Sets and Systems vol 269 pp 65ndash89 2015

[12] N Li J Chen Y Yuan X Tian Y Han and M Xia ldquoA wi-indoor localization strategy using particle swarm optimizationbased articial neural networksrdquo International Journal ofDistributed Sensor Networks vol 12 no 3 article 45831472016

002040608

112

k-N

N

4-la

yer A

NN

(SCG

)

4-la

yer A

NN

(RBP

)

3-la

yer A

NN

(SCG

)

3-la

yer A

NN

(RBP

)

2-la

yer A

NN

(SCG

)

2-la

yer A

NN

(RBP

)

Klowast

FURI

A

Dee

pLea

rnin

g4J

CEns

Loc

k-N

N

4-la

yer A

NN

(SCG

)

4-la

yer A

NN

(RBP

)

3-la

yer A

NN

(SCG

)

3-la

yer A

NN

(RBP

)

2-la

yer A

NN

(SCG

)

2-la

yer A

NN

(RBP

)

Klowast

FURI

A

Dee

pLea

rnin

g4J

CEns

Loc

10-CV Test dataset

PrecisionRecallAccuracy

Figure 5 Performance measures on both 10-CV and test dataset

Table 5 OOB loss

Average 10-fold loss OOB loss00292813 00300123

10 Mobile Information Systems

[13] C Song J Wang and G Yuan ldquoHidden naive bayes indoorfingerprinting localization based on best-discriminating APselectionrdquo ISPRS International Journal of Geo-Informationvol 5 no 10 p 189 2016

[14] P Bahl and V N Padmanabhan ldquoRADAR an in-building RFbased user location and tracking systemrdquo in Proceedings ofIEEE INFOCOM 2000 Conference on Computer Communi-cations Nineteenth Annual Joint Conference of the IEEEComputer and Communications Societies (Cat No00CH37064) vol 2 pp 775ndash784 Tel Aviv Israel March 2000

[15] M Cooper J Biehl G Filby and S Kratz ldquoLoCo boosting forindoor location classification combining Wi-Fi and BLErdquoPersonal and Ubiquitous Computing vol 20 no 1 pp 1ndash142016

[16] C Wang F Wu Z Shi and D Zhang ldquoIndoor positioningtechnique by combining RFID and particle swarmoptimization-based back propagation neural networkrdquo Optik(Stuttg) vol 127 no 17 pp 6839ndash6849 2016

[17] J Xu H Dai and W Ying ldquoMulti-layer neural network forreceived signal strength-based indoor localisationrdquo IETCommunications vol 10 no 6 pp 717ndash723 2016

[18] W Zhang K Liu W Zhang Y Zhang and J Gu ldquoDeepneural networks for wireless localization in indoor andoutdoor environmentsrdquo Neurocomputing vol 194 pp 279ndash287 2016

[19] L Calderoni M Ferrara A Franco and D Maio ldquoIndoorlocalization in a hospital environment using Random Forestclassifiersrdquo Expert Systems with Applications vol 42 no 1pp 125ndash134 2015

[20] E Jedari Z Wu R Rashidzadeh and M Saif ldquoWi-Fi basedindoor location positioning employing random forest clas-sifierrdquo in Proceedings of 2015 International Conference onIndoor Positioning and Indoor Navigation (IPIN) pp 13ndash16Banff Albeta Canada October 2015

[21] Y Mo Z Zhang Y Lu W Meng and G Agha ldquoRandomforest based coarse locating and KPCA feature extraction forindoor positioning systemrdquo Mathematical Problems in En-gineering vol 2014 Article ID 850926 8 pages 2014

[22] R Gorak and M Luckner ldquoMalfunction immune WindashFilocalisationmethodrdquo in Computational Collective Intelligencepp 328ndash337 Springer Cham Switzerland 2015

[23] R Gorak and M Luckner ldquoModified random forest algorithmfor WindashFi indoor localization systemrdquo in Proceedings of In-ternational Conference on Computational Collective Intelligencepp 147ndash157 Cham Switzerland September 2016

[24] J Niu B Wang L Cheng and J J P C Rodrigues ldquoWicLocan indoor localization system based on WiFi fingerprints andcrowdsourcingrdquo in Proceedings of 2015 IEEE InternationalConference on Communications (ICC) pp 3008ndash3013 Lon-don UK June 2015

[25] G Ding Z Tan J Zhang and L Zhang ldquoFingerprintinglocalization based on affinity propagation clustering and ar-tificial neural networksrdquo in Proceedings of IEEE WirelessCommunications and Networking Conference (WCNC)pp 2317ndash2322 Shanghai China April 2013

[26] O Leon J Hernandez-Serrano and M Soriano ldquoSecuringcognitive radio networksrdquo International Journal of Commu-nication Systems vol 23 no 5 pp 633ndash652 2010

[27] S Tuncer and T Tuncer ldquoIndoor localization with bluetoothtechnology using artificial neural networksrdquo in Proceedings ofthe IEEE 19th International Conference on Intelligent Engi-neering Systems pp 213ndash217 Bratislava Slovakia September2015

[28] B A Akram A H Akbar B Wajid O Shafiq and A ZafarldquoLocSwayamwar finding a suitable ML algorithm for wi-fifingerprinting based indoor positioning systemrdquo in LectureNotes in Electrical Engineering A Boyaci A Ekti M Aydinand S Yarkan Eds vol 504 Springer Singapore 2018

[29] L Breiman ldquoRandom forestsrdquo Machine Learning vol 45no 1 pp 5ndash32 2001

[30] K Kaji and N Kawaguchi ldquoDesign and implementation ofwifi indoor localization based on Gaussian mixture model andparticle filterrdquo in Proceedings of 2012 International Conferenceon Indoor Positioning and Indoor Navigation (IPIN) pp 1ndash9Sydney Australia November 2012

[31] C Wu Z Yang Y Liu and W Xi ldquoWILL wireless indoorlocalization without site surveyrdquo IEEE Transactions on Par-allel and Distributed Systems vol 24 no 4 pp 839ndash848 2013

[32] S Yang P Dessai M Verma and M Gerla ldquoFreeLoccalibration-free crowdsourced indoor localizationrdquo in Pro-ceedings of IEEE INFOCOM pp 2481ndash2489 Turin Italy April2013

[33] T Seidl and H-P Kriegel ldquoOptimal multi-step k-nearestneighbor searchrdquo ACM SIGMOD Record vol 27 no 2pp 154ndash165 1998

[34] W S Mcculloch and W Pitts ldquoA logical calculus nervousactivityrdquo Bulletin of Mathematical Biology vol 52 no l-2pp 99ndash115 1990

[35] J G Cleary and L E Trigg ldquoKlowast an instance-based learnerusing an entropic distance measurerdquo in Proceedings of TwelfthInternational Conference on Machine Learning vol 5 pp 1ndash14 Tahoe City CA USA July 1995

[36] J Huhn and E Hullermeier ldquoFURIA an algorithm for un-ordered fuzzy rule inductionrdquo Data Mining and KnowledgeDiscovery vol 19 no 3 pp 293ndash319 2009

[37] S Ciss Generalization Error and Out-of-bag Bounds inRandom (Uniform) Forests PhD thesis French UniversityParis France 2015

Mobile Information Systems 11

Computer Games Technology

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

Advances in

FuzzySystems

Hindawiwwwhindawicom

Volume 2018

International Journal of

ReconfigurableComputing

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

thinspArtificial Intelligence

Hindawiwwwhindawicom Volumethinsp2018

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawiwwwhindawicom Volume 2018

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Computational Intelligence and Neuroscience

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

Human-ComputerInteraction

Advances in

Hindawiwwwhindawicom Volume 2018

Scientic Programming

Submit your manuscripts atwwwhindawicom


approaches, but the difference was trifling and cannot be detected by any human user of the system. The accuracy, precision, and recall provided by CEnsLoc were much greater than those of all other approaches.

FURIA stands out as the second-best performer for indoor localization. It is an upgraded version of the RIPPER algorithm, indicating that rule-based algorithms, especially fuzzified versions with good generalization capabilities, perform well for location estimation too. However, it lagged behind CEnsLoc in terms of accuracy, precision, and recall by 5%, 10%, and 15%, respectively.
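The gaps quoted above can be checked against the reported scores. The short sketch below recomputes them in percentage points from the FURIA and CEnsLoc rows of Table 2 (10-CV) and Table 4 (test subset). The dictionary layout and the function name `gaps` are illustrative and not part of the paper. Under this reading, the 5, 10, and 15 point figures correspond to the test-subset results, while the 10-CV precision gap comes out at 9 points.

```python
# Scores in percent, copied from Table 2 (10-CV) and Table 4 (test subset).
# Order within each tuple: (accuracy, precision, recall).
SCORES = {
    "10-CV": {"FURIA": (92, 89, 82), "CEnsLoc": (97, 98, 97)},
    "test": {"FURIA": (90, 86, 79), "CEnsLoc": (95, 96, 94)},
}


def gaps(setting):
    """Return CEnsLoc minus FURIA for (accuracy, precision, recall), in points."""
    furia = SCORES[setting]["FURIA"]
    censloc = SCORES[setting]["CEnsLoc"]
    return tuple(c - f for c, f in zip(censloc, furia))


if __name__ == "__main__":
    for setting in SCORES:
        print(setting, gaps(setting))
```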

The details of both 10-CV and test dataset results for all compared IPS and CEnsLoc are depicted in Figure 5 for side-by-side visual comparison.

5.2. Out-of-Bag (OOB) Error Results. There is a performance measure called OOB error, which is peculiar to RFE. During training of RFE, data subsets are generated with replacement, resulting in some repeated and some left-out observations (OOB observations) for each tree. That particular tree does not train on these left-out OOB observations. The prediction capability of RFE can be measured using “OOB loss,” which is the error made on unseen OOB observations during training. The OOB loss measure has been investigated and shown to provide an upper bound on testing error [37], which is specifically useful for small-sized datasets. Hence, OOB error can be used just like, or instead of, an unseen test data subset if the available dataset is small or an unseen test dataset is unavailable. It provides a very good estimate of the trained classifier’s generalization capability. Table 5 summarizes the OOB loss compared with the averaged 10-CV loss, indicating that the OOB loss indeed bounds it.
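The OOB mechanism described above can be illustrated with a minimal sketch. It assumes bagged decision stumps as a deliberately simplified stand-in for the Random Forest ensemble (no random feature subsets, no deep trees) and synthetic two-room RSSI data; it is not the configuration used for CEnsLoc. All function names and the data generator are hypothetical. Each observation is voted on only by the stumps whose bootstrap sample left it out, and the OOB loss is the resulting misclassification rate.

```python
import random
from collections import Counter


def bootstrap_indices(n, rng):
    """Sample n row indices with replacement; also return the left-out (OOB) rows."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    return in_bag, set(range(n)) - set(in_bag)


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def fit_stump(X, y, rows):
    """Fit a one-split tree (decision stump) on the given rows by training error."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({X[i][f] for i in rows}):
            left = [y[i] for i in rows if X[i][f] <= t]
            right = [y[i] for i in rows if X[i][f] > t]
            if not right:
                continue  # a threshold at the maximum value splits nothing
            l_lab, r_lab = majority(left), majority(right)
            err = sum(v != l_lab for v in left) + sum(v != r_lab for v in right)
            if best is None or err < best[0]:
                best = (err, f, t, l_lab, r_lab)
    if best is None:  # every feature is constant on these rows
        lab = majority([y[i] for i in rows])
        return 0, float("inf"), lab, lab
    return best[1:]


def predict(stump, row):
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab


def oob_loss(X, y, n_trees=25, seed=0):
    """Misclassification rate when each row is voted on only by trees that never saw it."""
    rng = random.Random(seed)
    votes = [Counter() for _ in X]
    for _ in range(n_trees):
        in_bag, oob = bootstrap_indices(len(X), rng)
        stump = fit_stump(X, y, in_bag)
        for i in oob:
            votes[i][predict(stump, X[i])] += 1
    scored = [i for i, v in enumerate(votes) if v]  # rows that were OOB at least once
    wrong = sum(votes[i].most_common(1)[0][0] != y[i] for i in scored)
    return wrong / len(scored)


def make_rooms(n_per_room=100, seed=1):
    """Synthetic RSSI pairs (dBm) from two access points for two rooms (made-up data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, (m1, m2) in (("room_A", (-40, -70)), ("room_B", (-70, -40))):
        for _ in range(n_per_room):
            X.append((rng.gauss(m1, 4), rng.gauss(m2, 4)))
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_rooms()
    print("OOB loss:", oob_loss(X, y))
```

Because every row is scored only by models that did not train on it, the estimate needs no separate held-out subset, which is the property the section relies on for small datasets.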

6. Conclusion and Future Work

Location prediction/estimation provides derivation of meaningful context for a broad range of services and applications. Indoor localization can open altogether a plethora of new opportunities because humans spend most of their time indoors. CEnsLoc offers shorter response time and an overall improvement in accuracy, precision, and recall. With only a few parameters to be tuned, it is suited for FP-based localization, which requires frequent recollection of data and retraining. CEnsLoc was able to attain 97% accuracy in comparison with other IPS averaged over 6 different configurations, namely, FURIA, k-NN, K*, ANN, and DeepLearning4J, with 92%, 91%, 90%, 85%, and 73% accuracies, respectively. It can be utilized for elderly assistance, navigation, smart buildings, and smart transportation, to name a few potential applications.

Our future work includes deployment of CEnsLoc across a wide range of civil infrastructures, including different floors and/or buildings, to understand its performance in more detail, as well as its scalability using crowdsourcing. We also aim to build safety, security, and evacuation guide applications with CEnsLoc at their core for users in offices, universities, and retail. Furthermore, integration with GPS, Bluetooth, and PDR to further enhance accuracy, utilizing available hybrid technologies at a given time, is also part of the plan.

Table 2: Tenfold cross-validated performance evaluation and comparison of CEnsLoc.

                 Accuracy   Precision   Recall
k-NN             0.91       0.88        0.87
ANN              0.85       0.83        0.78
K*               0.90       0.96        0.62
FURIA            0.92       0.89        0.82
DeepLearning4J   0.73       0.51        0.41
CEnsLoc          0.97       0.98        0.97

Table 3: Tenfold cross-validated performance comparison of CEnsLoc in terms of training and response time (seconds).

Time (sec)       Training time (10-fold)   Avg. training time (1-fold)   Response time (dataset)   Response time (1 sample)
k-NN             -                         -                             1.97                      1.70E-04
ANN              26720                     2672                          0.18                      1.41E-04
K*               -                         -                             103.09                    8.17E-02
FURIA            1587                      158                           0.03                      2.05E-05
DeepLearning4J   75559                     755                           0.09                      6.82E-05
CEnsLoc          14007                     14007                         2.35                      2.08E-04

Table 4: Performance evaluation and comparison of CEnsLoc on test subset.

                 Accuracy   Precision   Recall
k-NN             0.89       0.87        0.85
ANN              0.82       0.81        0.76
K*               0.87       0.94        0.60
FURIA            0.90       0.86        0.79
DeepLearning4J   0.71       0.48        0.39
CEnsLoc          0.95       0.96        0.94

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07048697).

References

[1] J. Torres-Sospedra, R. Montoliu, S. Trilles, O. Belmonte, and J. Huerta, “Comprehensive analysis of distance and similarity measures for Wi-Fi fingerprinting indoor positioning systems,” Expert Systems with Applications, vol. 42, no. 23, pp. 9263–9278, 2015.

[2] H. Mehmood and N. K. Tripathi, “Cascading artificial neural networks optimized by genetic algorithms and integrated with global navigation satellite system to offer accurate ubiquitous positioning in urban environment,” Computers, Environment and Urban Systems, vol. 37, no. 1, pp. 35–44, 2013.

[3] M. M. Soltani, A. Motamedi, and A. Hammad, “Enhancing cluster-based RFID tag localization using artificial neural networks and virtual reference tags,” Automation in Construction, vol. 54, pp. 93–105, 2015.

[4] M. L. Rodrigues, L. F. M. Vieira, and M. F. M. Campos, “Fingerprinting-based radio localization in indoor environments using multiple wireless technologies,” in Proceedings of IEEE International Symposium on Personal, Indoor and Mobile Radio Communications, pp. 1203–1207, Toronto, ON, Canada, September 2011.

[5] J. Luo and H. Gao, “Deep belief networks for fingerprinting indoor localization using ultrawideband technology,” International Journal of Distributed Sensor Networks, vol. 12, no. 1, article 5840916, 2016.

[6] S. Saeedi, L. Paull, M. Trentini, and H. Li, “Neural network-based multiple robot simultaneous localization and mapping,” IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 880–885, 2011.

[7] M. Azizyan, R. R. Choudhury, and I. Constandache, “SurroundSense: mobile phone localization via ambience fingerprinting,” in Proceedings of the 15th Annual International Conference on Mobile Computing and Networking-MobiCom’09, pp. 261–272, Beijing, China, 2009.

[8] L. Pei, R. Chen, J. Liu et al., “Motion Recognition Assisted Indoor Wireless Navigation on a Mobile Phone,” in Proceedings of the 23rd International Technical Meeting of The Satellite Division of the Institute of Navigation, pp. 3366–3375, Portland, OR, USA, September 2010.

[9] P. S. Nagpal and R. Rashidzadeh, “Indoor positioning using magnetic compass and accelerometer of smartphones,” in Proceedings of International Conference on Selected Topics in Mobile and Wireless Networking (MoWNeT), pp. 140–145, Montreal, Canada, 2013.

[10] J. Menke and A. Zakhor, “Multi-modal indoor positioning of mobile devices,” in Proceedings of IEEE International Conference on Indoor Positioning and Indoor Navigation, pp. 13–16, Alberta, Canada, October 2015.

[11] M. Oussalah, M. Alakhras, and M. I. Hussein, “Multivariable fuzzy inference system for fingerprinting indoor localization,” Fuzzy Sets and Systems, vol. 269, pp. 65–89, 2015.

[12] N. Li, J. Chen, Y. Yuan, X. Tian, Y. Han, and M. Xia, “A wi-fi indoor localization strategy using particle swarm optimization based artificial neural networks,” International Journal of Distributed Sensor Networks, vol. 12, no. 3, article 4583147, 2016.

Figure 5: Performance measures (precision, recall, and accuracy) on both 10-CV and test dataset for k-NN, 4-, 3-, and 2-layer ANN (SCG and RBP), K*, FURIA, DeepLearning4J, and CEnsLoc.

Table 5: OOB loss.

Average 10-fold loss   OOB loss
0.0292813              0.0300123


[13] C. Song, J. Wang, and G. Yuan, “Hidden naive bayes indoor fingerprinting localization based on best-discriminating AP selection,” ISPRS International Journal of Geo-Information, vol. 5, no. 10, p. 189, 2016.

[14] P. Bahl and V. N. Padmanabhan, “RADAR: an in-building RF based user location and tracking system,” in Proceedings of IEEE INFOCOM 2000. Conference on Computer Communications. Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies (Cat. No. 00CH37064), vol. 2, pp. 775–784, Tel Aviv, Israel, March 2000.

[15] M. Cooper, J. Biehl, G. Filby, and S. Kratz, “LoCo: boosting for indoor location classification combining Wi-Fi and BLE,” Personal and Ubiquitous Computing, vol. 20, no. 1, pp. 1–14, 2016.

[16] C. Wang, F. Wu, Z. Shi, and D. Zhang, “Indoor positioning technique by combining RFID and particle swarm optimization-based back propagation neural network,” Optik (Stuttg), vol. 127, no. 17, pp. 6839–6849, 2016.

[17] J. Xu, H. Dai, and W. Ying, “Multi-layer neural network for received signal strength-based indoor localisation,” IET Communications, vol. 10, no. 6, pp. 717–723, 2016.

[18] W. Zhang, K. Liu, W. Zhang, Y. Zhang, and J. Gu, “Deep neural networks for wireless localization in indoor and outdoor environments,” Neurocomputing, vol. 194, pp. 279–287, 2016.

[19] L. Calderoni, M. Ferrara, A. Franco, and D. Maio, “Indoor localization in a hospital environment using Random Forest classifiers,” Expert Systems with Applications, vol. 42, no. 1, pp. 125–134, 2015.

[20] E. Jedari, Z. Wu, R. Rashidzadeh, and M. Saif, “Wi-Fi based indoor location positioning employing random forest classifier,” in Proceedings of 2015 International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 13–16, Banff, Alberta, Canada, October 2015.

[21] Y. Mo, Z. Zhang, Y. Lu, W. Meng, and G. Agha, “Random forest based coarse locating and KPCA feature extraction for indoor positioning system,” Mathematical Problems in Engineering, vol. 2014, Article ID 850926, 8 pages, 2014.

[22] R. Gorak and M. Luckner, “Malfunction immune Wi–Fi localisation method,” in Computational Collective Intelligence, pp. 328–337, Springer, Cham, Switzerland, 2015.

[23] R. Gorak and M. Luckner, “Modified random forest algorithm for Wi–Fi indoor localization system,” in Proceedings of International Conference on Computational Collective Intelligence, pp. 147–157, Cham, Switzerland, September 2016.

[24] J. Niu, B. Wang, L. Cheng, and J. J. P. C. Rodrigues, “WicLoc: an indoor localization system based on WiFi fingerprints and crowdsourcing,” in Proceedings of 2015 IEEE International Conference on Communications (ICC), pp. 3008–3013, London, UK, June 2015.

[25] G. Ding, Z. Tan, J. Zhang, and L. Zhang, “Fingerprinting localization based on affinity propagation clustering and artificial neural networks,” in Proceedings of IEEE Wireless Communications and Networking Conference (WCNC), pp. 2317–2322, Shanghai, China, April 2013.

[26] O. Leon, J. Hernandez-Serrano, and M. Soriano, “Securing cognitive radio networks,” International Journal of Communication Systems, vol. 23, no. 5, pp. 633–652, 2010.

[27] S. Tuncer and T. Tuncer, “Indoor localization with bluetooth technology using artificial neural networks,” in Proceedings of the IEEE 19th International Conference on Intelligent Engineering Systems, pp. 213–217, Bratislava, Slovakia, September 2015.

[28] B. A. Akram, A. H. Akbar, B. Wajid, O. Shafiq, and A. Zafar, “LocSwayamwar: finding a suitable ML algorithm for wi-fi fingerprinting based indoor positioning system,” in Lecture Notes in Electrical Engineering, A. Boyaci, A. Ekti, M. Aydin, and S. Yarkan, Eds., vol. 504, Springer, Singapore, 2018.

[29] L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.

[30] K. Kaji and N. Kawaguchi, “Design and implementation of wifi indoor localization based on Gaussian mixture model and particle filter,” in Proceedings of 2012 International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 1–9, Sydney, Australia, November 2012.

[31] C. Wu, Z. Yang, Y. Liu, and W. Xi, “WILL: wireless indoor localization without site survey,” IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 4, pp. 839–848, 2013.

[32] S. Yang, P. Dessai, M. Verma, and M. Gerla, “FreeLoc: calibration-free crowdsourced indoor localization,” in Proceedings of IEEE INFOCOM, pp. 2481–2489, Turin, Italy, April 2013.

[33] T. Seidl and H.-P. Kriegel, “Optimal multi-step k-nearest neighbor search,” ACM SIGMOD Record, vol. 27, no. 2, pp. 154–165, 1998.

[34] W. S. Mcculloch and W. Pitts, “A logical calculus nervous activity,” Bulletin of Mathematical Biology, vol. 52, no. 1-2, pp. 99–115, 1990.

[35] J. G. Cleary and L. E. Trigg, “K*: an instance-based learner using an entropic distance measure,” in Proceedings of Twelfth International Conference on Machine Learning, vol. 5, pp. 1–14, Tahoe City, CA, USA, July 1995.

[36] J. Huhn and E. Hullermeier, “FURIA: an algorithm for unordered fuzzy rule induction,” Data Mining and Knowledge Discovery, vol. 19, no. 3, pp. 293–319, 2009.

[37] S. Ciss, Generalization Error and Out-of-bag Bounds in Random (Uniform) Forests, Ph.D. thesis, French University, Paris, France, 2015.


Page 10: CEnsLoc: Infrastructure-Less Indoor Localization ...downloads.hindawi.com/journals/misy/2018/3287810.pdf · 4.Proposed Localization Methodology 4.1. Problem Formulation. We assume

available hybrid technologies at a given time is also part ofthe plan

Data Availability

e data used to support the ndings of this study areavailable from the corresponding author upon request

Conflicts of Interest

e authors declare that they have no connoticts of interest

Acknowledgments

is research was supported by Basic Science ResearchProgram through the National Research Foundation ofKorea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07048697)

References

[1] J Torres-Sospedra R Montoliu S Trilles O Belmonte andJ Huerta ldquoComprehensive analysis of distance and similaritymeasures for Wi-Fi ngerprinting indoor positioning sys-temsrdquo Expert Systems with Applications vol 42 no 23pp 9263ndash9278 2015

[2] H Mehmood and N K Tripathi ldquoCascading articial neuralnetworks optimized by genetic algorithms and integrated withglobal navigation satellite system to oer accurate ubiquitouspositioning in urban environmentrdquo Computers Environmentand Urban Systems vol 37 no 1 pp 35ndash44 2013

[3] M M Soltani A Motamedi and A Hammad ldquoEnhancingcluster-based RFID tag localization using articial neuralnetworks and virtual reference tagsrdquo Automation in Con-struction vol 54 pp 93ndash105 2015

[4] M L Rodrigues L F M Vieira and M F M CamposldquoFingerprinting-based radio localization in indoor environ-ments using multiple wireless technologiesrdquo in Proceedings ofIEEE International Symposium on Personal Indoor andMobile Radio Communications pp 1203ndash1207 Toronto ONCanada September 2011

[5] J Luo and H Gao ldquoDeep belief networks for ngerprintingindoor localization using ultrawideband technologyrdquo In-ternational Journal of Distributed Sensor Networks vol 12no 1 article 5840916 2016

[6] S Saeedi L Paull M Trentini and H Li ldquoNeural network-based multiple robot simultaneous localization and map-pingrdquo IEEE Transactions on Neural Networks vol 22 no 12pp 880ndash885 2011

[7] M Azizyan R R Choudhury and I Constandache ldquoSur-roundSense mobile phone localization via ambience nger-printingrdquo in Proceedings of the 15th annual internationalconference on Mobile computing and networking-Mobi-Comrsquo09 pp 261ndash272 Beijing China 2009

[8] L Pei R Chen J Liu et al ldquoMotion Recognition AssistedIndoor Wireless Navigation on a Mobile Phonerdquo in Pro-ceedings of the 23rd International Technical Meeting of eSatellite Division of the Institute of Navigation pp 3366ndash3375Portland OR USA September 2010

[9] P S Nagpal and R Rashidzadeh ldquoIndoor positioning usingmagnetic compass and accelerometer of smartphonesrdquo inProceedings of International Conference on Selected Topics inMobile and Wireless Networking (MoWNeT) pp 140ndash145Montreal Canada 2013

[10] J Menke and A Zakhor ldquoMulti-modal indoor positioning ofmobile devicesrdquo in Proceedings of IEEE International Con-ference on Indoor Positioning and Indoor Navigationpp 13ndash16 Alberta Canada October 2015

[11] M Oussalah M Alakhras and M I Hussein ldquoMultivariablefuzzy inference system for ngerprinting indoor localizationrdquoFuzzy Sets and Systems vol 269 pp 65ndash89 2015

[12] N Li J Chen Y Yuan X Tian Y Han and M Xia ldquoA wi-indoor localization strategy using particle swarm optimizationbased articial neural networksrdquo International Journal ofDistributed Sensor Networks vol 12 no 3 article 45831472016

002040608

112

k-N

N

4-la

yer A

NN

(SCG

)

4-la

yer A

NN

(RBP

)

3-la

yer A

NN

(SCG

)

3-la

yer A

NN

(RBP

)

2-la

yer A

NN

(SCG

)

2-la

yer A

NN

(RBP

)

Klowast

FURI

A

Dee

pLea

rnin

g4J

CEns

Loc

k-N

N

4-la

yer A

NN

(SCG

)

4-la

yer A

NN

(RBP

)

3-la

yer A

NN

(SCG

)

3-la

yer A

NN

(RBP

)

2-la

yer A

NN

(SCG

)

2-la

yer A

NN

(RBP

)

Klowast

FURI

A

Dee

pLea

rnin

g4J

CEns

Loc

10-CV Test dataset

PrecisionRecallAccuracy

Figure 5 Performance measures on both 10-CV and test dataset

Table 5 OOB loss

Average 10-fold loss OOB loss00292813 00300123

10 Mobile Information Systems

[13] C Song J Wang and G Yuan ldquoHidden naive bayes indoorfingerprinting localization based on best-discriminating APselectionrdquo ISPRS International Journal of Geo-Informationvol 5 no 10 p 189 2016

[14] P Bahl and V N Padmanabhan ldquoRADAR an in-building RFbased user location and tracking systemrdquo in Proceedings ofIEEE INFOCOM 2000 Conference on Computer Communi-cations Nineteenth Annual Joint Conference of the IEEEComputer and Communications Societies (Cat No00CH37064) vol 2 pp 775ndash784 Tel Aviv Israel March 2000

[15] M Cooper J Biehl G Filby and S Kratz ldquoLoCo boosting forindoor location classification combining Wi-Fi and BLErdquoPersonal and Ubiquitous Computing vol 20 no 1 pp 1ndash142016

[16] C Wang F Wu Z Shi and D Zhang ldquoIndoor positioningtechnique by combining RFID and particle swarmoptimization-based back propagation neural networkrdquo Optik(Stuttg) vol 127 no 17 pp 6839ndash6849 2016

[17] J Xu H Dai and W Ying ldquoMulti-layer neural network forreceived signal strength-based indoor localisationrdquo IETCommunications vol 10 no 6 pp 717ndash723 2016

[18] W Zhang K Liu W Zhang Y Zhang and J Gu ldquoDeepneural networks for wireless localization in indoor andoutdoor environmentsrdquo Neurocomputing vol 194 pp 279ndash287 2016

[19] L Calderoni M Ferrara A Franco and D Maio ldquoIndoorlocalization in a hospital environment using Random Forestclassifiersrdquo Expert Systems with Applications vol 42 no 1pp 125ndash134 2015

[20] E Jedari Z Wu R Rashidzadeh and M Saif ldquoWi-Fi basedindoor location positioning employing random forest clas-sifierrdquo in Proceedings of 2015 International Conference onIndoor Positioning and Indoor Navigation (IPIN) pp 13ndash16Banff Albeta Canada October 2015

[21] Y Mo Z Zhang Y Lu W Meng and G Agha ldquoRandomforest based coarse locating and KPCA feature extraction forindoor positioning systemrdquo Mathematical Problems in En-gineering vol 2014 Article ID 850926 8 pages 2014

[22] R Gorak and M Luckner ldquoMalfunction immune WindashFilocalisationmethodrdquo in Computational Collective Intelligencepp 328ndash337 Springer Cham Switzerland 2015

[23] R Gorak and M Luckner ldquoModified random forest algorithmfor WindashFi indoor localization systemrdquo in Proceedings of In-ternational Conference on Computational Collective Intelligencepp 147ndash157 Cham Switzerland September 2016

[24] J. Niu, B. Wang, L. Cheng, and J. J. P. C. Rodrigues, “WicLoc: an indoor localization system based on WiFi fingerprints and crowdsourcing,” in Proceedings of 2015 IEEE International Conference on Communications (ICC), pp. 3008–3013, London, UK, June 2015.

[25] G. Ding, Z. Tan, J. Zhang, and L. Zhang, “Fingerprinting localization based on affinity propagation clustering and artificial neural networks,” in Proceedings of IEEE Wireless Communications and Networking Conference (WCNC), pp. 2317–2322, Shanghai, China, April 2013.

[26] O. Leon, J. Hernandez-Serrano, and M. Soriano, “Securing cognitive radio networks,” International Journal of Communication Systems, vol. 23, no. 5, pp. 633–652, 2010.

[27] S. Tuncer and T. Tuncer, “Indoor localization with bluetooth technology using artificial neural networks,” in Proceedings of the IEEE 19th International Conference on Intelligent Engineering Systems, pp. 213–217, Bratislava, Slovakia, September 2015.

[28] B. A. Akram, A. H. Akbar, B. Wajid, O. Shafiq, and A. Zafar, “LocSwayamwar: finding a suitable ML algorithm for wi-fi fingerprinting based indoor positioning system,” in Lecture Notes in Electrical Engineering, A. Boyaci, A. Ekti, M. Aydin, and S. Yarkan, Eds., vol. 504, Springer, Singapore, 2018.

[29] L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, 2001.

[30] K. Kaji and N. Kawaguchi, “Design and implementation of wifi indoor localization based on Gaussian mixture model and particle filter,” in Proceedings of 2012 International Conference on Indoor Positioning and Indoor Navigation (IPIN), pp. 1–9, Sydney, Australia, November 2012.

[31] C. Wu, Z. Yang, Y. Liu, and W. Xi, “WILL: wireless indoor localization without site survey,” IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 4, pp. 839–848, 2013.

[32] S. Yang, P. Dessai, M. Verma, and M. Gerla, “FreeLoc: calibration-free crowdsourced indoor localization,” in Proceedings of IEEE INFOCOM, pp. 2481–2489, Turin, Italy, April 2013.

[33] T. Seidl and H.-P. Kriegel, “Optimal multi-step k-nearest neighbor search,” ACM SIGMOD Record, vol. 27, no. 2, pp. 154–165, 1998.

[34] W. S. McCulloch and W. Pitts, “A logical calculus of the ideas immanent in nervous activity,” Bulletin of Mathematical Biology, vol. 52, no. 1–2, pp. 99–115, 1990.

[35] J. G. Cleary and L. E. Trigg, “K*: an instance-based learner using an entropic distance measure,” in Proceedings of Twelfth International Conference on Machine Learning, vol. 5, pp. 1–14, Tahoe City, CA, USA, July 1995.

[36] J. Huhn and E. Hullermeier, “FURIA: an algorithm for unordered fuzzy rule induction,” Data Mining and Knowledge Discovery, vol. 19, no. 3, pp. 293–319, 2009.

[37] S. Ciss, Generalization Error and Out-of-bag Bounds in Random (Uniform) Forests, Ph.D. thesis, French University, Paris, France, 2015.

[13] C Song J Wang and G Yuan ldquoHidden naive bayes indoorfingerprinting localization based on best-discriminating APselectionrdquo ISPRS International Journal of Geo-Informationvol 5 no 10 p 189 2016

[14] P Bahl and V N Padmanabhan ldquoRADAR an in-building RFbased user location and tracking systemrdquo in Proceedings ofIEEE INFOCOM 2000 Conference on Computer Communi-cations Nineteenth Annual Joint Conference of the IEEEComputer and Communications Societies (Cat No00CH37064) vol 2 pp 775ndash784 Tel Aviv Israel March 2000

[15] M Cooper J Biehl G Filby and S Kratz ldquoLoCo boosting forindoor location classification combining Wi-Fi and BLErdquoPersonal and Ubiquitous Computing vol 20 no 1 pp 1ndash142016

[16] C Wang F Wu Z Shi and D Zhang ldquoIndoor positioningtechnique by combining RFID and particle swarmoptimization-based back propagation neural networkrdquo Optik(Stuttg) vol 127 no 17 pp 6839ndash6849 2016

[17] J Xu H Dai and W Ying ldquoMulti-layer neural network forreceived signal strength-based indoor localisationrdquo IETCommunications vol 10 no 6 pp 717ndash723 2016

[18] W Zhang K Liu W Zhang Y Zhang and J Gu ldquoDeepneural networks for wireless localization in indoor andoutdoor environmentsrdquo Neurocomputing vol 194 pp 279ndash287 2016

[19] L Calderoni M Ferrara A Franco and D Maio ldquoIndoorlocalization in a hospital environment using Random Forestclassifiersrdquo Expert Systems with Applications vol 42 no 1pp 125ndash134 2015

[20] E Jedari Z Wu R Rashidzadeh and M Saif ldquoWi-Fi basedindoor location positioning employing random forest clas-sifierrdquo in Proceedings of 2015 International Conference onIndoor Positioning and Indoor Navigation (IPIN) pp 13ndash16Banff Albeta Canada October 2015

[21] Y Mo Z Zhang Y Lu W Meng and G Agha ldquoRandomforest based coarse locating and KPCA feature extraction forindoor positioning systemrdquo Mathematical Problems in En-gineering vol 2014 Article ID 850926 8 pages 2014

[22] R Gorak and M Luckner ldquoMalfunction immune WindashFilocalisationmethodrdquo in Computational Collective Intelligencepp 328ndash337 Springer Cham Switzerland 2015

[23] R Gorak and M Luckner ldquoModified random forest algorithmfor WindashFi indoor localization systemrdquo in Proceedings of In-ternational Conference on Computational Collective Intelligencepp 147ndash157 Cham Switzerland September 2016

[24] J Niu B Wang L Cheng and J J P C Rodrigues ldquoWicLocan indoor localization system based on WiFi fingerprints andcrowdsourcingrdquo in Proceedings of 2015 IEEE InternationalConference on Communications (ICC) pp 3008ndash3013 Lon-don UK June 2015

[25] G Ding Z Tan J Zhang and L Zhang ldquoFingerprintinglocalization based on affinity propagation clustering and ar-tificial neural networksrdquo in Proceedings of IEEE WirelessCommunications and Networking Conference (WCNC)pp 2317ndash2322 Shanghai China April 2013

[26] O Leon J Hernandez-Serrano and M Soriano ldquoSecuringcognitive radio networksrdquo International Journal of Commu-nication Systems vol 23 no 5 pp 633ndash652 2010

[27] S Tuncer and T Tuncer ldquoIndoor localization with bluetoothtechnology using artificial neural networksrdquo in Proceedings ofthe IEEE 19th International Conference on Intelligent Engi-neering Systems pp 213ndash217 Bratislava Slovakia September2015

[28] B A Akram A H Akbar B Wajid O Shafiq and A ZafarldquoLocSwayamwar finding a suitable ML algorithm for wi-fifingerprinting based indoor positioning systemrdquo in LectureNotes in Electrical Engineering A Boyaci A Ekti M Aydinand S Yarkan Eds vol 504 Springer Singapore 2018

[29] L Breiman ldquoRandom forestsrdquo Machine Learning vol 45no 1 pp 5ndash32 2001

[30] K Kaji and N Kawaguchi ldquoDesign and implementation ofwifi indoor localization based on Gaussian mixture model andparticle filterrdquo in Proceedings of 2012 International Conferenceon Indoor Positioning and Indoor Navigation (IPIN) pp 1ndash9Sydney Australia November 2012

[31] C Wu Z Yang Y Liu and W Xi ldquoWILL wireless indoorlocalization without site surveyrdquo IEEE Transactions on Par-allel and Distributed Systems vol 24 no 4 pp 839ndash848 2013

[32] S Yang P Dessai M Verma and M Gerla ldquoFreeLoccalibration-free crowdsourced indoor localizationrdquo in Pro-ceedings of IEEE INFOCOM pp 2481ndash2489 Turin Italy April2013

[33] T Seidl and H-P Kriegel ldquoOptimal multi-step k-nearestneighbor searchrdquo ACM SIGMOD Record vol 27 no 2pp 154ndash165 1998

[34] W S Mcculloch and W Pitts ldquoA logical calculus nervousactivityrdquo Bulletin of Mathematical Biology vol 52 no l-2pp 99ndash115 1990

[35] J G Cleary and L E Trigg ldquoKlowast an instance-based learnerusing an entropic distance measurerdquo in Proceedings of TwelfthInternational Conference on Machine Learning vol 5 pp 1ndash14 Tahoe City CA USA July 1995

[36] J Huhn and E Hullermeier ldquoFURIA an algorithm for un-ordered fuzzy rule inductionrdquo Data Mining and KnowledgeDiscovery vol 19 no 3 pp 293ndash319 2009

[37] S Ciss Generalization Error and Out-of-bag Bounds inRandom (Uniform) Forests PhD thesis French UniversityParis France 2015
