Towards self-characterization of user mobility patterns
Mathias Boc, Anne Fladenmuller, and Marcelo Dias de Amorim
Laboratoire d’Informatique de Paris 6 (LIP6/CNRS)
Université Pierre et Marie Curie – Paris 6
104, avenue du Président Kennedy, 75016, Paris, France
Emails: {boc,fladenmu,amorim}@rp.lip6.fr
Abstract: In this paper, we investigate the individuality of mobility patterns in wireless networks composed of IEEE 802.11 access points. We propose a mobility-aware clustering algorithm that uses roaming events as the metric to evaluate the proximity to access points (APs) without using any geographical information. The contributions of this clustering algorithm are threefold. First, it provides a sanitized image of the topological mobility of individuals. Second, it categorizes clusters as belonging to agglomerations (where individuals pause) or to paths between agglomerations. Third, it proposes a classification of user mobility with regard to the number of places with social meaning for the individual relative to the number of visited clusters. We analyze data collected over periods of up to 8 months and show that the differences in activeness, coverage of mobility, home locality, and size of the list of guest locations are clearly individual-related. Such results serve as a basis for the definition of future user-centric communication systems.
I. INTRODUCTION AND RELATED WORK
Wireless networks are a response to the needs of mobility
of individuals. For such networks to be efficient in providing
high-quality services, it is fundamental that the networked
infrastructure understands the reality of user displacements.
Such an investigation is important for achieving three main
goals: (1) to understand how mobility affects network
services, (2) to create realistic mobility models to evaluate
the performance of protocols and algorithms, and (3) to propose
new communication solutions that can adapt to the specificities
of each user. This last point is fundamental in the wireless
context where resources are scarce.
We can classify the study of user mobility into three
categories: network-centric, AP-centric, and user-centric.
Network-centric studies are mainly focused on the dynamics
of the population with regard to the usage of network re-
sources [1], [2]. In other words, the idea is to recognize
hotspots in order to increase the density of access points (APs)
in the area. AP-centric studies address localized solutions
mainly on traffic engineering, quality of service (QoS), as well
as resource management [3]. User-centric approaches focus on
complementary support such as mobility modelling [4], [5] and
network cache management [6].
This work has been partially supported by the European Commission project WIP under contract 27402 and by the RNRT project Airnet under contract 01205.
In this paper, we focus on user-centric solutions and more
specifically on how users behave in terms of mobility. We
refer to mobility as the variation of the association of nodes
to the infrastructure through APs. We will see later that,
contrary to traditional approaches that consider mobility as a
simple sequence of APs, our proposal inherently captures some
constraints of the environment without relying on geographic
information.
Understanding and classifying user mobility with a user-centric
approach requires an analysis of wireless data traces.
However, the results obtained from raw wireless data traces
are difficult to interpret due to variations in wireless link
quality and to the geometric characteristics of the topology.
Indeed, it is difficult to map an observed topological mobility
to the real physical mobility, as variations in signal quality
can yield different association sequences. In the literature,
the solutions proposed to overcome these obstacles approximate
the real user location by using building segmentations, by
correlating the association sequence with GPS information from
sample users, or by using the coordinates of the APs [3], [7].
However, such approximations introduce some distortions in the
observed topological mobility and thus misrepresent its real
impact on the network services (e.g., network address
attribution, subnetwork organization, location management, and
forwarding). In fact, although several works contributed to a
better understanding of mobility in the wireless context, they
generally considered users as an ensemble. Much has to be done
toward a better understanding of the individual contributions
of each node to the global behavior of the network.
In this paper, we propose a clustering algorithm that, without
using any geographical information, returns a sanitized image
of the observed mobility on an individual basis. We rely
on measurement campaigns conducted at Dartmouth campus
that are available to the research community. Through a
gathering process of APs, we better understand how micro-
mobility can impact the services offered by the network. From
these sanitized data, we are able to differentiate APs which
might have a social meaning for an individual (we call this
a place; a similar definition, called “hubs”, has been
introduced in [8]) from APs which can be interpreted as part
of a path between two places. Finally, thanks to this distinction
between APs and to a long-term study of the evolution of
individual displacements, we can distinguish different types
of mobility and thus propose a classification of individual
mobility. The resulting classification clearly underlines the fact
that users behave so differently from each other that future
communication infrastructures for wireless networks should
consider the possibility of running on a per-user basis. Our
results are consistent with recent studies in the domain [2]
and offer further understanding of network dynamics from an
individual point of view.
The remainder of this paper is organized as follows. In Sec-
tion II, we describe the data traces we used in our experiments.
In Section III, we enumerate the different phenomena that
introduce discrepancies between the observed topological
mobility and the physical mobility. We detail our individual
mobility-based clustering algorithm and what it adds to the
comprehension of individual mobility in Section IV. In Section V,
we analyze individual-related mobility characteristics over
different periods of time. Finally, we conclude this paper in
Section VI.
II. DATA TRACES
For our analyses, we use data traces of the IEEE 802.11
wireless network of the Dartmouth campus [9]. The campus is
composed of 188 buildings, covered by 566 official wireless
access points. The total surface is about 200 acres and the
number of users is about 5500.
Our study focuses on the movement files [10] over a two-month
period (2004-01-01 to 2004-02-29). We use this period
of two months for three reasons: (1) to avoid noise due to
network events and school holidays, (2) to capture both short
and long-range mobility patterns, and (3) to observe specific
user behaviors according to the users' interest in the
network. This collection of data represents association and
disassociation events, obtained from Syslog data on anonymized
wireless cards.
III. UNDERSTANDING INDIVIDUAL MOBILITY
The comprehension of individual mobility becomes a tough
problem when it relies on raw measurement data. Indeed, the
wireless nature of the network with all the variations and
consequences it implies, plus the density of the APs in the
environment, creates variations in the observed topological
mobility.
We can cite at least four types of events which can cause
these variations:
• Ping-pong effect. It refers to the succession of
associations-disassociations between two or more APs.
It is caused by the closeness of the signal to noise ratio
(SNR) of neighboring APs and/or the aggressiveness of
the wireless card, and can be misinterpreted as mobility.
• Localized network problems. If for some technical rea-
sons, one AP becomes disabled for a certain amount of
time, the user probably associates with another neighbor-
ing AP. Without geographical information, this event can
make us suppose that the user changed his/her habits.
• Physical micro-variations. The physical mobility of the
individual can present some small variations (about a few
meters). In topologically dense areas, these micro-variations
can result in different association patterns.
• Erroneous reproducibility. There is a probability that the
same physical movement results in different association
sequences. Without geographical information like build-
ings position or GPS data, it is difficult to correlate these
physical movements with a topological view.
Concerning the last point, the repetition of movements
can create junctures between those different patterns and
create sectors of micro-mobility which can be detected from a
topological standpoint. An algorithm that recognizes
these junctures is therefore required to better understand the
real objectives of the observed movements. This is part of our
proposal detailed in the following.
IV. INDIVIDUAL MOBILITY-BASED CLUSTERING
We propose a clustering algorithm based on individual
mobility. By using the number of roaming events (without
disconnection) between APs as the main metric, we group,
for each user, APs that are close to each other in terms of
probability of being visited.
A. Description of the algorithm
The algorithm is divided into three parts: the collection of
network events, the clustering, and the categorization of the
resulting clusters into places with social meaning.
1) Collection of network events: The collection of network
events relies on two data structures. For each individual,
the relationship with the $N$ APs in the network is represented
through the $N \times N$ roaming matrix $R$, where the element
$r_{ij}$ of $R$ counts the cumulated number of roaming events
between the APs $i$ and $j$. Because roaming events are cumulated,
the relationship between two subsets of APs can appear
after independent sessions.
The second data structure is also created on a per-user basis.
It is an $N$-row table that stores general information about each
AP, such as the total number of associations, the cumulated
association duration, and the average association duration.
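As an illustration of these two per-user structures, the following is a minimal sketch rather than the authors' implementation. It assumes a hypothetical trace format: a time-ordered list of (timestamp in seconds, AP name) pairs for one user, with the marker "OFF" standing for a disconnection. It also assumes that $r_{ij}$ counts transitions from AP $i$ to AP $j$, and it stores the roaming matrix sparsely as a dictionary instead of a dense $N \times N$ array.

```python
from collections import defaultdict

OFF = "OFF"  # assumed marker for a disconnection in the trace


def collect_events(trace, end_time):
    """Build the roaming counts and the per-AP table for ONE user.

    trace: list of (timestamp_seconds, ap_name), sorted by time.
    end_time: timestamp closing the last association (assumption).
    """
    roaming = defaultdict(int)  # (i, j) -> roaming events from i to j
    stats = defaultdict(lambda: {"associations": 0, "duration": 0})
    for idx, (t, ap) in enumerate(trace):
        if ap == OFF:
            continue
        # An association lasts until the next event in the trace.
        next_t = trace[idx + 1][0] if idx + 1 < len(trace) else end_time
        stats[ap]["associations"] += 1
        stats[ap]["duration"] += next_t - t
        # A roaming event is a change of AP without disconnection.
        if idx > 0:
            prev_ap = trace[idx - 1][1]
            if prev_ap != OFF and prev_ap != ap:
                roaming[(prev_ap, ap)] += 1
    for s in stats.values():
        s["average"] = s["duration"] / s["associations"]
    return dict(roaming), dict(stats)


trace = [(0, "A"), (10, "B"), (30, "A"), (40, OFF), (100, "B"), (160, OFF)]
roaming, stats = collect_events(trace, end_time=160)
```

In this toy trace, the reassociation to B after the disconnection is counted as an association but not as a roaming event, which matches the "without disconnection" condition used for the metric.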
2) The clustering algorithm: First, we define the
terms which will be used in the algorithm:
• Link. There is a “link” $l_{ij}$ between two APs $i$ and $j$ if
there are bi-directional roaming events between these two
APs ($r_{ij} \neq 0$ and $r_{ji} \neq 0$).
• Cost of a link. The cost of a link or “distance” between
two APs $i$ and $j$ is equal to $r_{ij} + r_{ji}$. We thus define the
cost of a link $l_{ij}$ as $c_{ij} = c_{ji} = r_{ij} + r_{ji}$.
• Cluster. A “cluster” is an AP or a group of APs. One
AP can be in only one cluster. Two APs are eligible to
be merged in the same cluster if there exists a link between
them.
• Weight of a cluster. The weight of a cluster is the value
of the maximum cost link of the cluster.
We start with a graph representation of the network in which
vertices are APs. An edge exists between two vertices if there
is a link (as defined above) between the corresponding APs.
In order to limit the variations of intra-cluster link costs, we
define a threshold k (with 0 ≤ k ≤ 1).
[Figure: example graph of APs A to H with link costs, partitioned into Cluster A (weight 50), Cluster D (weight 10), Cluster H (weight 80), and Cluster E (weight 0).]
Fig. 1. Clustering of the access points according to the number of roaming events between them (with k = 0.5).
At the beginning, each AP $i$ forms a cluster $c_i$ of size 1
and weight $w_i = 0$. We consider first the link with the highest
cost in the graph. Two APs $i$ (cluster $c_i$) and $j$ (cluster
$c_j$) can be merged if $c_{ij} \geq k \times \max\{w_i, w_j\}$.
In that case, the two clusters are merged and the weight of the
resulting cluster is equal to the highest cost among the links
within the cluster.
We repeat the clustering process until there are no more
links to be considered. An example of a resulting clustered
graph representation is illustrated in Fig. 1. One can notice
that the cost of the links from AP d in cluster $c_d$ to AP f or AP
g in cluster $c_h$ is not sufficient to merge both clusters. Such
an approach serves to differentiate paths from locations where
a user stays longer.
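The merging rule can be sketched as follows. This is an illustrative reading of the description, not the authors' code: it assumes that links are examined in decreasing order of cost, that ties are broken by AP name, and that clusters are tracked with a union-find structure (an implementation choice). The input `roaming` is a hypothetical dictionary mapping a pair (i, j) to the number of roaming events from i to j.

```python
def cluster_aps(roaming, k=0.5):
    """Group the APs of one user; returns a sorted list of (members, weight)."""
    aps = sorted({ap for pair in roaming for ap in pair})
    # A link needs roaming events in both directions; its cost is r_ij + r_ji.
    costs = {}
    for (i, j), r_ij in roaming.items():
        r_ji = roaming.get((j, i), 0)
        if i < j and r_ij > 0 and r_ji > 0:
            costs[(i, j)] = r_ij + r_ji

    parent = {ap: ap for ap in aps}  # union-find forest
    weight = {ap: 0 for ap in aps}   # weight of the cluster rooted at each AP

    def find(ap):
        while parent[ap] != ap:
            parent[ap] = parent[parent[ap]]  # path halving
            ap = parent[ap]
        return ap

    # Assumption: links are processed from the highest cost to the lowest.
    for (i, j), cost in sorted(costs.items(), key=lambda kv: (-kv[1], kv[0])):
        ri, rj = find(i), find(j)
        if ri == rj:
            continue  # both APs already belong to the same cluster
        if cost >= k * max(weight[ri], weight[rj]):
            parent[rj] = ri
            weight[ri] = max(weight[ri], weight[rj], cost)

    clusters = {}
    for ap in aps:
        clusters.setdefault(find(ap), []).append(ap)
    return sorted((tuple(m), weight[root]) for root, m in clusters.items())


roaming = {("A", "B"): 30, ("B", "A"): 20, ("B", "C"): 15, ("C", "B"): 15,
           ("C", "D"): 6, ("D", "C"): 4, ("D", "E"): 3}
result = cluster_aps(roaming, k=0.5)
```

With k = 0.5, the link C-D (cost 10) is below half the weight of the cluster containing A, B, and C (50), so D stays apart. E has roaming events in one direction only, so it has no link and remains a singleton.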
3) Categorization of the clusters: The result of this
clustering algorithm is an embedded graph composed of a list of
clusters. However, this is a flat representation of associations
because there is no temporal aspect that gives us the means
to categorize the clusters. We introduce this temporal aspect
by computing, for each cluster, the sum of the cumulated
association duration of each AP within the cluster. As we can
observe in Fig. 2, for a mobile user u1, there is a limited
number of clusters in which u1 spent most of her/his time
(around 10 APs). Similarly, there is a large number of clusters
for which the cumulated association duration does not reach
a certain threshold (100 seconds in the figure).
In order to define a threshold which depends on the mobility
of an individual, we propose to take the overall average
association duration among all the visited APs. For each cluster
$c_i$, we choose the AP with the highest average association
duration to represent the cluster. If this value is greater than the
threshold, the cluster is considered as a place with social
meaning. Otherwise, the cluster is considered as
a part of a path or, simply, as a place without social meaning.
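The categorization step can be sketched as below. The sketch makes one assumption that the text leaves open: the overall average association duration is computed as the total cumulated duration divided by the total number of associations over all visited APs. The input layout (`stats` mapping an AP name to a pair of number of associations and cumulated duration) is hypothetical.

```python
def categorize(clusters, stats):
    """Return {cluster: True if place with social meaning, else False}.

    clusters: iterable of tuples of AP names.
    stats: dict ap -> (number_of_associations, cumulated_duration_seconds).
    """
    total_assoc = sum(n for n, _ in stats.values())
    total_dur = sum(d for _, d in stats.values())
    # Assumed reading of "overall average association duration".
    threshold = total_dur / total_assoc
    result = {}
    for members in clusters:
        # The AP with the highest average duration represents the cluster.
        best = max(stats[ap][1] / stats[ap][0] for ap in members)
        result[members] = best > threshold
    return result


stats = {"A": (10, 6000), "B": (5, 500), "C": (20, 200), "D": (5, 50)}
clusters = [("A", "B"), ("C",), ("D",)]
places = categorize(clusters, stats)
```

Here the threshold is 6750 / 40 = 168.75 seconds. The cluster (A, B) is represented by A, whose average is 600 seconds, so it is a place; C and D average 10 seconds each and are treated as parts of a path.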
V. CHARACTERISATION OF INDIVIDUAL-RELATED MOBILITY PATTERN
In this section we enumerate individual-related mobility
characteristics and study the evolution of these parameters
through different periods of time. We still use the initial
period of January to February 2004 and, for certain parameters,
extend this period to eight months (September 2003 to April
2004). We provide the same analysis for the same group of
users who were active during the initial period in order to
see the evolution of their behaviors.
[Figure: log-log plot; x-axis: Rank (num); y-axis: Cumulated association duration (sec); curves: after clustering, before clustering.]
Fig. 2. Log-log of the cumulated association durations with and without using the clustering algorithm.
A. Periods of activity and mobility coverage
The periods of activity and the association durations are
clearly individual-related. These periods are important to un-
derstand the pattern of network utilization of an individual.
Individuals that are active all the time may not show the
same network behavior as individuals that are active only on
weekday nights. Of course, these two different behaviors
introduce different requirements. The mobility coverage (the
number of different clusters visited during the period) is also
an individual-related parameter which depends on the user’s
knowledge of the network connectivity. The correlation between
periods of activity and mobility coverage is individual-related
and is difficult to predict.
We compute, per day, the number of visited APs for a user
u1 for the period from September 2003 to April 2004. We
chose this specific user because of her/his regularity in the
utilization of the network during the different observed periods.
Then we compared it with the behavior of a group of users.
We observe that users are more active on weekdays than
on weekends but the average number of visited APs among all
users remains stable (see Fig. 3(b)). During school holidays,
there are more variations in the visited APs. This is partly
due to the smaller number of users present in the network
and the different usages of the network. The period of activity
within weekdays can slightly change (the day of activity can be
different) but, even if there are great variations in the coverage
of mobility, the number of visited APs remains high (between
15 and 75).
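The per-day count of visited APs can be obtained along the following lines. This is a hedged sketch: the trace layout (time-ordered pairs of Unix timestamp and AP name, with "OFF" for a disconnection) and the function name are assumptions made for illustration, and an association is attributed to the day on which it starts.

```python
from collections import defaultdict

SECONDS_PER_DAY = 86400


def visited_aps_per_day(trace, start):
    """Count the distinct APs visited per day, days numbered from `start`."""
    days = defaultdict(set)
    for t, ap in trace:
        if ap != "OFF":  # assumed disconnection marker
            days[(t - start) // SECONDS_PER_DAY].add(ap)
    return {day: len(aps) for day, aps in sorted(days.items())}


trace = [(0, "A"), (100, "B"), (200, "OFF"), (86400, "A"), (90000, "A")]
coverage = visited_aps_per_day(trace, start=0)
```

Days with no association simply do not appear in the result, so gaps in the keys correspond to periods of inactivity.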
B. Local Micro-Mobility and paths
The association durations within clusters reflect
individual-related mobility characteristics. This is also a means to classify
users’ mobility. Fig. 4 is a log-log view of the association
durations computed per week for an individual u2. We chose
u2 because she/he does show a change in her/his mobility
behavior through the observed period.
TABLE I
COMPARISON BETWEEN THE OBSERVED PERIODS. JANUARY TO FEBRUARY 2004 IS THE REFERENCE PERIOD.
Observed period                    Sept-Oct 03  Oct-Nov 03  Nov-Dec 03  Dec 03-Jan 04  Jan-Feb 04  Feb-March 04  March-Apr 04
Total Nb of active users           -1463        -1941       -1748       -1131          5514        -1075         -1565
Total Nb of active APs             546          545         535         532            537         544           550
Nb home locations (AP basis)       75.8%        69.5%       68.5%       74%            76.7%       70.4%         70.1%
Nb home locations (cluster basis)  91%          88%         87%         89%            90%         87%           86%
[Figure: two panels; x-axis: Day (num), 0 to 238; y-axis: Number of different visited access points.]
(a) Number of different visited APs by the user u1 per day from 2003-09-01 to 2004-04-30.
(b) Average and standard deviation of the number of different visited APs by our group of users per day from 2003-09-01 to 2004-04-30.
Fig. 3. Periods of activity and number of visited APs.
[Figure: log-log plot; x-axis: Sorted access points by cumulated association duration; y-axis: Cumulated association duration; one curve per week, week 1 to week 8.]
Fig. 4. Log-log of the cumulated association duration among the clusters of u2 per week between 2004-01-01 and 2004-02-29.
There are two important parameters in this figure which
can help classify the mobility patterns: the number of visited
clusters and the number of places with social meaning. The ratio
between the number of places and the total number of visited
clusters gives us information on the intensity of the mobility.
The closer to 1 the ratio is, the less mobile an individual is.
In the same way, the number of places gives us information
on the regularity of the mobility. If this number is equal to 1,
the individual is stationary.
By combining this information we can propose four
classes of mobility:
• Stationary: 1 visited cluster. The individual is static or
undergoes only micro-mobility (ping-pong effect).
• Occasional: 1 place with social meaning but a ratio lower
than 1. The individual mobility shows a prevalence
for one cluster and a number of associations in other
clusters that stay under the average association duration.
• Regular: ratio between r and 1 (with 0 < r < 1),
and number of places with social meaning higher than 1.
The individual stays regularly associated with a small number
of places and has a limited ratio.
• Intense: ratio between 0 and r, and number of
places with social meaning higher than 1. The mobility
of this individual is complex and could be difficult to
predict.
In order to differentiate regular users from intense ones, we
have set the ratio r to 0.2. With this value, regular users have
at least 20% of places with social meanings in their list of
visited clusters.
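The four classes can be expressed as a small decision function. This sketch follows the definitions above, with two assumptions of ours: a ratio exactly equal to r is treated as regular (consistent with "at least 20%"), and the case of several visited clusters but no place with social meaning, which the classification does not cover, is returned as "unclassified".

```python
def classify_mobility(n_clusters, n_places, r=0.2):
    """Classify a user from the number of visited clusters and of places."""
    if n_clusters == 1:
        return "stationary"
    if n_places == 0:
        return "unclassified"  # case not covered by the four classes (assumption)
    ratio = n_places / n_clusters
    if n_places == 1:
        return "occasional"  # one place, ratio lower than 1
    return "regular" if ratio >= r else "intense"


labels = [classify_mobility(1, 1), classify_mobility(10, 1),
          classify_mobility(10, 3), classify_mobility(20, 2)]
```

For example, a user with 3 places among 10 visited clusters has a ratio of 0.3 and is regular, while 2 places among 20 clusters gives 0.1 and is intense.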
This classification is close in essence to the one proposed by
Balazinska and Castro [2] through a system of prevalence (the
fraction of time a user stays associated with one AP). By
computing the maximum prevalence and the median prevalence,
they are also able to classify the different mobility patterns.
Nevertheless, we use here different metrics (places with social
meaning compared to the number of visited clusters).
[Figure: x-axis: Day (num), 0 to 238; y-axis: Number of active wireless cards, 0 to 3500.]
Fig. 5. Number of active users per day during the observed period.
C. Home location and guest locations
In our context, we define as a possible “home location”
the cluster with the highest association duration and “guest
locations” the other clusters considered as places with social
meaning (the home location is a special case of guest location).
A home location is an AP or a group of collocated APs
where a user spends at least 50% of his/her total time in the
network [1], [2], [7]. With an algorithm based on single APs we
observe that 76% of users have a home location, against 90%
with the clustering algorithm (see Table I). This result is of
the same order as the results obtained in [7] (up to 95%
in the same environment); it is important to underline that the
results shown in [7] were computed using a scheme based
on the geographic coordinates of the APs. Home location
is definitely an individual-related characteristic because most
home locations do not correspond to network hotspots [7]. If
we cannot distinguish a home location for a user, it is because
of the association durations at the other guest locations or a
change of home location. A change of home location denotes
a deep modification of the user behavior while the number of
guest locations denotes the regularity of the mobility.
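The 50% criterion can be sketched as follows, on an AP basis and on a cluster basis. The function name, the input layout, and the toy durations are hypothetical; the example only illustrates why grouping APs can reveal a home location that no single AP reaches.

```python
def home_location(durations, share=0.5):
    """Return the location holding at least `share` of the total time, or None.

    durations: dict location (AP or cluster) -> cumulated duration in seconds.
    """
    total = sum(durations.values())
    best = max(durations, key=durations.get)
    return best if durations[best] >= share * total else None


# AP basis: no single AP reaches 50% of the total time.
per_ap = {"A": 400, "B": 350, "C": 250}
home_ap = home_location(per_ap)

# Cluster basis: A and B are assumed to belong to the same cluster.
per_cluster = {("A", "B"): 750, ("C",): 250}
home_cluster = home_location(per_cluster)
```

In this toy case the time is split between two neighboring APs, so the AP-based test finds no home location while the cluster-based test does, which is the effect visible in the two home-location rows of Table I.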
D. Cyclic mobility patterns
Users show cyclic mobility patterns. We observe that a great
majority of users is present during weekdays of school periods.
The regularity of the number of individuals associated on
weekdays and on weekends reflects this cyclic pattern (see
Fig. 5). However, the activity during school holidays presents
some great variations. This aspect can introduce complexity in
the way the network is managed. If we correlate the weeks of
activity of the individual u1 in Fig. 3(a) with the
number of active users in Fig. 5, we see that u1 (days 56 to
70 and 84 to 98) was not associated with the network while
the general behavior shows a constant number of associations.
Thus, it is the sum of all these different user activity patterns
which gives the cyclic aspect observed on a network scale.
VI. CONCLUSION AND FUTURE WORK
In this paper we proposed an individual mobility-based
clustering algorithm which uses roaming events as a metric
to evaluate the proximity of APs. Thanks to this algorithm,
which does not use any geographical information, we can
differentiate network places with social importance for each
individual from APs which are part of a path between two
destinations. We have analyzed the evolution of
individual-related mobility characteristics within an 8-month period.
We showed that the difference in activeness, the coverage of
mobility but also the existence of a home location, and the
number of guest locations are clearly individual-related. As a
consequence, defining fixed parameters among all the users
seems to be inefficient to classify all users’ behaviors.
An immediate conclusion of our study is that
individual-centric approaches should be used toward efficient
communication systems. Finally, thanks to the clustering algorithm
and the analysis of the mobility behaviors, we have been able
to classify user mobility into four categories: intense, regular,
stationary, and occasional.
Compared to other works which generally considered users
as an ensemble, this work is part of an effort to analyze the
contribution of each individual to the general dynamics of the
network. As future work, we are interested in studying the impact
of clusters on network services and how each individual could
contribute to improving network services.
REFERENCES
[1] D. Schwab and R. Bunt, “Characterising the use of a campus wireless network,” in Proc. of the 23rd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM), Hong Kong, China, 2004, pp. 862–870.
[2] M. Balazinska and P. Castro, “Characterizing mobility and network usage in a corporate wireless local-area network,” in Proc. of MobiSys 2003, San Francisco, CA, May 2003, pp. 303–316.
[3] M. Kim and D. Kotz, “Classifying the mobility of users and the popularity of access points,” in Proceedings of the International Workshop on Location- and Context-Awareness (LoCA), ser. Lecture Notes in Computer Science, T. Strang and C. Linnhoff-Popien, Eds., vol. 3479. Germany: Springer-Verlag, May 2005, pp. 198–209.
[4] W. jen Hsu and A. Helmy, “On modeling user associations in wireless LAN traces on university campuses,” in Proc. of the Second Workshop on Wireless Network Measurements (WiNMee 2006), Boston, MA, USA, 2006.
[5] M. Kim, D. Kotz, and S. Kim, “Extracting a mobility model from real user traces,” in Proc. of the 25th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM), Barcelona, Spain, 2006.
[6] F. Chinchilla, M. Lindsey, and M. Papadopouli, “Analysis of wireless information locality and association patterns in a campus,” in Proc. of the 23rd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM), Hong Kong, China, 2004, pp. 906–917.
[7] T. Henderson, D. Kotz, and I. Abyzov, “The changing usage of a mature campus-wide wireless network,” in Proc. of the Tenth Annual International Conference on Mobile Computing and Networking (MobiCom), Philadelphia, PA, USA, 2004, pp. 187–201.
[8] J. Ghosh, M. J. Beal, H. Q. Ngo, and C. Qiao, “On profiling mobility and predicting locations of campus-wide wireless users,” in Proc. of the second international workshop on Multi-hop ad hoc networks: from theory to reality, Florence, Italy.
[9] CRAWDAD, “A community resource for archiving wireless data at dartmouth.” [Online]. Available: http://crawdad.cs.dartmouth.edu
[10] D. Kotz, T. Henderson, and I. Abyzov, “CRAWDAD trace set dartmouth/campus/movement (v. 2005-03-08),” March 2005. [Online]. Available: http://crawdad.cs.dartmouth.edu/dartmouth/campus/movement/