Passive Anonymous Mobile Positioning Data for Tourism Statistics

Margus Tiru 1, Rein Ahas 2

1 Positium LBS, Estonia, [email protected]

2 Department of Geography, University of Tartu, Estonia, [email protected]

1. Introduction

The 21st century began in many regions with the rising spatial mobility of both individuals

and goods, and with opening borders. The rising spatial mobility of individuals is connected

to the diversity of such motivations as leisure, shopping, work and business, and social

networks. This new societal mobility is empowered by rising cross-border information flows,

ICT use and the expanding meaning of virtual and mental travel (Sheller & Urry 2006). Due

to the diversity of different forms of travel and the motivations behind travelling, there is a

lack of border statistics and data concerning individuals’ space-time mobility in many

countries and regions.

The objective of this study is to introduce a methodology for generating tourism statistics

from passive mobile positioning data. The authors use the mobile operator’s Call Detail

Records (CDR) for the purposes of this study, as an example of the flow of inbound foreign

visitors. The questions of our research focus on the following issues:

1) What kind of tourism statistics can be generated using CDR?

2) How to define those new statistics and relate them to existing statistics?

3) What kind of data processing steps are necessary for generating such statistics?

4) How to evaluate the quality of mobile positioning based tourism statistics?

The study of phone movements in and between mobile networks enables measuring the flows

of people in and between countries, which is why mobile positioning has become useful in

different fields of statistics. Mobile telephones are widespread in most countries and they can

be used for collecting data for different purposes. For example, mobile data has been used for

studying transportation and urban development (Asakura & Hato 2004; Calabrese et al. 2007;

Shoval 2007; Ahas et al. 2010), tourism (Shoval & Isaacson 2006; Ahas et al. 2008; Tiru et al.

2010), migration (Silm & Ahas 2010), and emergency management (Bengtsson et al. 2011).

Our experience with mobile positioning based statistics and geographical analyses in Estonia

goes back to 2002 and we have been collecting approximately 45% of incoming tourism related

data in Estonia since 2004 in cooperation with the largest mobile operator, EMT. Eesti Pank

(Estonian Central Bank) has been using mobile positioning statistics generated by Positium

LBS since 2008 (Positium 2009). The current paper describes the statistics developed in this

framework in Estonia.

The production of tourism statistics is internationally coordinated by several methodological

frameworks, such as the United Nations’ International Recommendations for Tourism

Statistics 2008 (UN 2008) and the Eurostat 1996 document “Applying the Eurostat

methodological guidelines in basic tourism and travel statistics” (Eurostat 2006). The

European initiatives are connected to the Council Directive 95/57/EC issued on 23 November

1995 regarding the collection of statistical information in the field of tourism (Council

Directive 95/57/EC), and Regulation (EU) 692/2011 issued by the European Parliament

concerning tourism related statistics in Europe (Regulation (EU) 692/2011). Our methodology

is composed based on the terminology and principles used in those documents.

Like any data source, mobile positioning has several limitations, for instance difficult

access to the data, sampling issues connected to the different data formats used in mobile

networks and differences in phone use in different cultures. These aspects will be discussed

later in the paper.

Privacy and data protection issues are extremely important in this matter and have to be

strictly followed according to the regulations and ethical principles. We point out here the

requirements specified in the EU directives for processing personal data (Directive 95/46/EC)

and the protection of privacy in the electronic communications sector (Directive 2002/58/EC).

The phone numbers used for the purposes of our methodology in Estonia were made

anonymous in the mobile operator’s system by a methodology developed by Positium LBS

(Positium 2009). The main principle followed was to keep the identity of all of the

respondents unknown and impossible to decode. The current paper does not focus on privacy

and data protection issues as these provide material for an entire separate paper.

The authors would like to thank EMT and Eesti Pank for their support in the form of the

provided data and their experiences with this topic. We would also like to thank all the

anonymous mobile subscribers, whose data was used in this study. The methodological and

theoretical development of mobile positioning based research at the University of Tartu has

been supported by the Estonian Information Technology Foundation (EITSA), the Target

Funding Project no. SF0180052s07 of the Estonian Ministry of Education, the EU Regional

Development Foundation, project TERIKVANT 3.2.0802.11-0043 of the Environmental

Conservation & Technology R&D Program and Research grant no. ETF7562 of the Estonian

Science Foundation.

2. Defining mobile positioning based statistics

2.1. Mobile positioning based data sources in the mobile operators’ system

Passive mobile positioning data refers to location data that is automatically recorded in the

mobile operators’ memory files as the locations of the telephones or as the network’s call

activities (Ahas & Mark 2005; Ahas et al. 2008). Passive mobile positioning data can be

collected by means of various methods from the mobile operator’s core network. The most

common method is to collect Call Detail Record (CDR) information from an invoice database

or a data warehouse. It is also possible to collect CDR information in real time from data

mediation services or to store real-time data from a radio access network (e.g. A-bis

interface).

Figure 1. Sources for passive mobile positioning data in the operators’ system.

The data formats and description levels of different passive mobile positioning data are

diverse. We evaluated 3 major sources of data for the purposes of this paper.

Table 1. Comparison of different data sources.

| Type of data | Individual or aggregated | Accessibility | Major features of the statistics | Adequacy for tourism statistics |
|---|---|---|---|---|
| Erlang | Aggregated | Easy, standard outlet from the operator’s system | Phone use intensity in the antenna | Low |
| CDR | Individual | Privacy problem, software development needed | Call activities, personal features from the operator | High |
| MPS | Individual | Easy, contract from respondent needed | Positioning frequencies determined by the researcher; questionnaire possible with the respondent | High |
| A-BIS probe-based | Individual | Privacy problem, software development needed | Call activities and handover logs; personal features from the operator | High |
| Anonymous Bulk Location Data | Individual | Privacy problem, software development needed | Call activities and handover logs; personal features from the operator | High |

Active mobile positioning (tracking) data refers to collecting phone location data according to

specially initiated queries. A number of systems have been developed for positioning

individual phones, such as friend-finders, sport-trackers, car-trackers etc. All the

aforementioned systems can be used for tracing individual phones to acquire tourism related

data. Such data collection and storage systems can be found in open sources such as

Google, First Location Bank etc. In most cases the data collection initiatives are limited by the

pool of persons or the timeframe, or the data might not be suitable for statistical purposes.

2.2. Defining statistical units for CDR based statistics

Call Activity – any active use (incoming or outgoing voice, text, internet or services) of a

mobile phone generated from the data warehouse or billing databases.

Handover between cells – data that enters from the mobile station or base station subsystem

and is concerned with the movement logs of the phones within the network.

The country of origin or nationality of visitors – determined here on the basis of the country the mobile phone is registered in. A phone registered in Estonia may, of course, be used by a person of any nationality. Nevertheless, the country of registration indicates the place where the person spends most of their time or to which they are strongly connected.

Random ID – a non-identifiable ID assigned by the operator in place of the phone number; a given phone number is always given the same ID.

Visitor – a unique person (mobile phone user) that has travelled to another country and has

performed call activities there.

Visit – a unique visit to another country by an individual. One visit is normally composed of

two trips: one to the destination and a second back home from that destination. One person

can make a number of visits. We use the visits that have been made by Estonians directly

from Estonia to Finland and by Finns directly from Finland to Estonia, without transit or a

stop in a third country.

Trip – a unique one way trip to another country by an individual, for example from Estonia to

Finland or from Finland to Estonia.

Number of days – the duration of one visit in days.

Number of nights – the duration of one visit in nights. The formula for calculating the

number of nights in one visit is: nights = days – 1.
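
As an illustration only, the units defined above could be derived from call activity records roughly as follows. This is a minimal sketch and not the procedure used in the study: the record layout `(random_id, activity_date)`, the rule that a new visit starts after a gap of more than `max_gap_days` without call activities, and the convention of counting both the first and the last calendar day are all assumptions made for the example. Only the relation nights = days – 1 is taken from the definitions.

```python
from collections import defaultdict
from datetime import date, timedelta


def split_into_visits(records, max_gap_days=1):
    """Group each random ID's call-activity dates into visits.

    records: iterable of (random_id, activity_date) pairs (hypothetical layout).
    Assumption: a new visit starts when two consecutive activity dates of the
    same random ID are more than `max_gap_days` apart.
    Returns a list of (random_id, first_date, last_date) tuples.
    """
    dates_by_id = defaultdict(set)
    for random_id, activity_date in records:
        dates_by_id[random_id].add(activity_date)

    visits = []
    for random_id, dates in dates_by_id.items():
        ordered = sorted(dates)
        start = prev = ordered[0]
        for current in ordered[1:]:
            if current - prev > timedelta(days=max_gap_days):
                # Gap too long: close the running visit and open a new one.
                visits.append((random_id, start, prev))
                start = current
            prev = current
        visits.append((random_id, start, prev))
    return visits


def number_of_days(first_date, last_date):
    """Duration of a visit in days, counting first and last day (assumed)."""
    return (last_date - first_date).days + 1


def number_of_nights(first_date, last_date):
    """nights = days - 1, as in the definition above."""
    return number_of_days(first_date, last_date) - 1


records = [
    ("a1", date(2010, 7, 1)),
    ("a1", date(2010, 7, 2)),
    ("a1", date(2010, 7, 10)),
    ("b2", date(2010, 7, 5)),
]
visits = split_into_visits(records)
```

With these sample records, the random ID `a1` yields two visits (1–2 July and 10 July) and `b2` one visit. The gap threshold strongly affects the visit count for visitors who use their phones rarely, so it would have to be calibrated against reference statistics.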

We have used different segments for the visits/visitors based on the duration of the visit, the

number of visits per year and the number of days spent in another country per year.

Visits. We have divided the visits based on the length of the stay. 1) Transit visits – visits to another country (Finland/Estonia) for a short period of time (the threshold may be set anywhere from under 3 to under 12 hours) before leaving for a third country on the same day. 2) Visits to the destination are divided based on the duration of the visit in days: one day visits, 2–4 day visits, 5 or more day visits.

Visitors. Based on the number of visits to another country, we divide the visitors as follows:

one time visitors, 2–4 time visitors, 5 or more time visitors. Based on the total number of days

spent in another country, the visitors are divided into 1 day visitors, 2–30 day visitors, 31+

day visitors and 183+ day visitors. According to the common definition (WTO), a visitor who

stays in another country for 183 or more days is considered a foreign labourer.
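
The segments described above can be expressed as simple classification rules. The following sketch is illustrative: the function names and segment labels are hypothetical, the `transit` flag is assumed to be determined elsewhere (for example from the hour threshold and the onward trip to a third country), and the 31+ and 183+ day groups are treated here as non-overlapping (31–182 and 183+), which is an assumption of the example rather than a rule stated in the text.

```python
def visit_segment(days, transit=False):
    """Segment a single visit by its duration in days."""
    if days < 1:
        raise ValueError("a visit lasts at least one day")
    if transit:
        return "transit"
    if days == 1:
        return "1 day"
    if days <= 4:
        return "2-4 days"
    return "5+ days"


def visitor_segment_by_visits(visit_count):
    """Segment a visitor by the number of visits per year."""
    if visit_count < 1:
        raise ValueError("a visitor has made at least one visit")
    if visit_count == 1:
        return "1 visit"
    if visit_count <= 4:
        return "2-4 visits"
    return "5+ visits"


def visitor_segment_by_days(total_days):
    """Segment a visitor by the total number of days spent abroad per year.

    Assumption: the 31+ group is capped at 182 days so that it does not
    overlap with the 183+ group.
    """
    if total_days < 1:
        raise ValueError("a visitor has spent at least one day abroad")
    if total_days == 1:
        return "1 day"
    if total_days <= 30:
        return "2-30 days"
    if total_days <= 182:
        return "31-182 days"
    return "183+ days"
```

For example, `visit_segment(3)` falls into the 2–4 day group, and a visitor with 200 days abroad per year falls into the 183+ day group.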

Geographically the analysis of Estonia has been divided into the territory of the city of

Tallinn, Harju County and Estonia as a whole. Finnish visitors in Estonia are studied

according to call activity locations. The home and work district locations of Estonian phone

users are measured using the anchor point model (Ahas et al. 2010).
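
Because the three territorial levels are nested (Tallinn lies within Harju County, which lies within Estonia), a visitor seen in Tallinn also counts at the two higher levels. A minimal sketch of such a count is given below; the record layout `(random_id, county, city)` and the function name are hypothetical, and the mapping of call activities to counties and cities is assumed to have been done beforehand.

```python
def unique_visitors_by_level(activities):
    """Count unique random IDs at three nested territorial levels.

    activities: iterable of (random_id, county, city) tuples (hypothetical
    layout), one per call activity located in Estonia.
    """
    levels = {"Tallinn": set(), "Harju County": set(), "Estonia": set()}
    for random_id, county, city in activities:
        levels["Estonia"].add(random_id)
        if county == "Harju":
            levels["Harju County"].add(random_id)
            if city == "Tallinn":
                levels["Tallinn"].add(random_id)
    return {level: len(ids) for level, ids in levels.items()}


sample = [
    ("a", "Harju", "Tallinn"),
    ("a", "Tartu", "Tartu"),
    ("b", "Harju", "Maardu"),
    ("c", "Tartu", "Tartu"),
]
counts = unique_visitors_by_level(sample)
```

Using sets of random IDs means that a visitor with many call activities in the same territory is counted once.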

3. Conclusions

Using mobile positioning data in scientific research has several advantages, such as the speed of data collection, the digital format of the data, the large sample and the high penetration of phones in most societies. Because of the lack of border statistics in today’s world, mobile positioning is often the easiest way to collect statistical data about travel. Mobile data also has several shortcomings that we have to keep in mind when interpreting the results. One of the weaknesses of such quantitative statistical data is that we do not know the exact motivations and relations lying behind the visits. The most important question is related to sampling: Who has a phone? Do they use it while travelling? How often do visitors use phones in a foreign country? As roaming calls are expensive, it is likely that wealthy tourists and businessmen use their phones more often than less active people with a lower income (children, students, pensioners). This means that sampling issues are also related to income and age groups. Calling is also connected with cultural differences, such as calling regulations and traditions. Another problem that arises when using mobile positioning data is its quantitative structure: we know the locations of calls (dots), but we do not know who is really making the calls, what kind of visit he/she is on, and what kind of transportation he/she is using. The huge amount of quantitative data also poses a problem for data processing and cleaning; the databases are too large for traditional software and data preparation tools.

4. References

Ahas, R., Aasa, A., Silm, S., Tiru, M. 2010. Daily rhythms of suburban commuters’ movements in the Tallinn metropolitan area: case study with mobile positioning data. Transportation Research C 18: 45–54.

Ahas, R., Aasa, A., Roose, A., Mark, Ü., Silm, S. 2008. Evaluating passive mobile positioning data for tourism surveys: An Estonian case study. Tourism Management 29(3): 469–486.

Ahas, R., Aasa, A., Mark, Ü., Pae, T., Kull, T. 2007. Seasonal tourism spaces in Estonia: case study with mobile positioning data. Tourism Management 28(3): 898–910.

Ahas, R., Mark, Ü. 2005. Location based services – new challenges for planning and public administration? Futures 37(6): 547–561.

Asakura, Y., Hato, E. 2004. Tracking survey for individual travel behaviour using mobile communication instruments. Transportation Research Part C 12: 273–291.

Positium LBS 2009. Mobile phone based study of tourism statistics, feasibility study report for Central Bank of Estonia, Tartu, manuscript.

Reades, J., Calabrese, F., Sevtsuk, A., Ratti, C. 2007. Cellular Census: Explorations in Urban Data Collection. IEEE Pervasive Computing 6(3): 30–38.

Sheller, M., Urry, J. 2006. The new mobilities paradigm. Environment and Planning A 38: 207–226.

Shoval, N. 2007. Sensing Human Society. Environment and Planning B 34: 191–195.

Shoval, N., Isaacson, M. 2007b. Sequence alignment as a method for human activity analysis in space and time. Annals of the Association of American Geographers 97(2): 282–297.

Silm, S., Ahas, R. 2010. The seasonal variability of population in Estonian municipalities. Environment and Planning A 42(10): 2527–2546.

Tiru, M., Saluveer, E., Ahas, R., Aasa, A. 2010. Web-based monitoring tool for assessing space-time mobility of tourists using mobile positioning data: Positium Barometer. Journal of Urban Technology 17(1): 71–89.

Tiru, M., Kuusik, A., Lamp, M-L., Ahas, R. 2010. LBS in marketing and tourism management: measuring destination loyalty with mobile positioning data. Journal of Location Based Services 4(2): 120–140.