Smart Itinerary Recommendation based on User-Generated GPS Trajectories

15th CTI Workshop, July 26, 2008


Hyoseok Yoon¹, Y. Zheng², X. Xie² and W. Woo¹

¹GIST U-VR Lab., ²Microsoft Research Asia

Traveling

• Popular leisure activity

How to use time wisely?

Trial-and-error is COSTLY!!!

<Source: Flickr, Photo By Wolfgang Staudt>

Commercial Solution

• A handful of itineraries
  – Major locations
  – Fixed time

• Not flexible

<Source: Flickr, Photo By Andrew. O>

Social Solution

• Ask residents of the region

• Refer to travel experts

• Learn from the experienced

<Source: Flickr, Photo By Supermariolxpt>

Introduction

• Data mining of GPS trajectories
  – User-generated
  – Travel routes
  – Travel experiences

• Itinerary recommendation

Related Work

• Itinerary Recommendation
  – Interactive systems for manually generating itineraries
    • INTRIGUE, TripTip
  – Travel recommendation system based on online travel info. (Huang and Bian)
  – Advanced Traveler Information System based on the shortest distance

• GPS Data Mining Applications
  – Finding patterns in GPS trajectories
  – Finding locations of interest
  – GeoLife: mining user similarity, interesting locations, and travel sequences

Contributions

• Build Location-Interest Graph
  – From multiple user-generated GPS trajectories
  – For modeling travel routes

• Define a good itinerary
  – How to define and model an itinerary
  – How it can be evaluated

• Smart itinerary recommendation framework
  – Recommend highly efficient and balanced itineraries

• Evaluation
  – Using a large GPS dataset
  – Simulated/real user queries

Preliminaries

• Trajectory: a sequence of time-stamped points

• Stay Point: a geographical region s
  – Where a user stayed longer than a time threshold while remaining within a distance threshold
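The stay point definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `Point`, `StayPoint`, `haversine_m` and `detect_stay_points` are hypothetical, the stay point is summarized by the mean coordinates of its points (an assumption), and the default thresholds of 20 minutes and 200 meters are taken from the Experiments settings.

```python
import math
from typing import List, NamedTuple


class Point(NamedTuple):
    lat: float
    lon: float
    t: float  # timestamp in seconds


class StayPoint(NamedTuple):
    lat: float
    lon: float
    arrive: float
    leave: float


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two GPS coordinates."""
    r = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(min(1.0, math.sqrt(a)))


def detect_stay_points(traj: List[Point], time_thresh_s=20 * 60, dist_thresh_m=200.0):
    """Scan a time-ordered trajectory and emit stay points."""
    stays = []
    i, n = 0, len(traj)
    while i < n:
        # Advance j while the points stay within the distance threshold of point i.
        j = i + 1
        while j < n and haversine_m(traj[i].lat, traj[i].lon,
                                    traj[j].lat, traj[j].lon) <= dist_thresh_m:
            j += 1
        # Points i..j-1 form a group; keep it if the user lingered long enough.
        if traj[j - 1].t - traj[i].t >= time_thresh_s:
            group = traj[i:j]
            stays.append(StayPoint(
                lat=sum(p.lat for p in group) / len(group),
                lon=sum(p.lon for p in group) / len(group),
                arrive=group[0].t,
                leave=group[-1].t,
            ))
            i = j
        else:
            i += 1
    return stays
```

Anchoring the distance check at the first point of each group keeps the scan simple; other variants compare against a running centroid instead.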


• Location History: A sequence of stay points a user visited

• Locations: Clusters of stay points detected from multiple users' trajectories
  – Substitute each stay point in a location history with the ID of the location it pertains to

[Figure: stay points (s) clustered into a Location]


• Typical Stay Time: Defined as the median of the stay times of the stay points in location l_i

• Typical Time Interval (∆T_i,j): Traveling time from location l_i to location l_j
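As a rough illustration of these two quantities, the sketch below computes them from location histories. The function names and the `(location_id, arrive, leave)` tuple layout are hypothetical. The slide states that the typical stay time is a median; using the median for the typical time interval as well is an assumption made here for symmetry.

```python
from collections import defaultdict
from statistics import median


def typical_stay_times(histories):
    """Median stay duration per location.

    histories: list of location histories, each a time-ordered list of
    (location_id, arrive, leave) tuples.
    """
    durations = defaultdict(list)
    for history in histories:
        for loc, arrive, leave in history:
            durations[loc].append(leave - arrive)
    return {loc: median(values) for loc, values in durations.items()}


def typical_time_intervals(histories):
    """Median travel time for each ordered pair of consecutively visited locations.

    Assumption: the aggregate is a median; the slide only says "traveling time".
    """
    gaps = defaultdict(list)
    for history in histories:
        for (a, _, leave_a), (b, arrive_b, _) in zip(history, history[1:]):
            if a != b:
                gaps[(a, b)].append(arrive_b - leave_a)
    return {pair: median(values) for pair, values in gaps.items()}
```

The interval dictionary is keyed by ordered pairs, so the time from l_i to l_j may differ from the time from l_j to l_i.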

[Figure: stay points (s) grouped into three Locations]

• Location Interest
  – The interest of a location is represented by authority scores (HITS-based inference model)*
  – User Experience as Hub
  – Locations as Authority

*Zheng, Y., Zhang, L., Xie, X., Ma, W.Y.: Mining Correlation Between Locations Using Human Location History, In: GIS 2009, pp. 472-475 (2009)
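To make the hub/authority idea concrete, here is a minimal sketch of plain HITS on a user-location visit matrix, with users as hubs and locations as authorities. It is not the inference model of the cited paper, which may add further refinements; the function name `hits_location_interest` and the `visits` layout are hypothetical.

```python
import math


def hits_location_interest(visits, iterations=50):
    """Plain HITS power iteration.

    visits: dict mapping user -> {location_id: visit_count}.
    Returns (hub_scores, authority_scores), each L2-normalized.
    """
    users = list(visits)
    locations = sorted({loc for per_user in visits.values() for loc in per_user})
    hub = {u: 1.0 for u in users}
    auth = {loc: 1.0 for loc in locations}

    for _ in range(iterations):
        # A location is interesting if experienced users visit it.
        auth = {loc: sum(visits[u].get(loc, 0) * hub[u] for u in users)
                for loc in locations}
        norm = math.sqrt(sum(x * x for x in auth.values())) or 1.0
        auth = {loc: x / norm for loc, x in auth.items()}

        # A user is experienced if they visit interesting locations.
        hub = {u: sum(count * auth[loc] for loc, count in visits[u].items())
               for u in users}
        norm = math.sqrt(sum(x * x for x in hub.values())) or 1.0
        hub = {u: x / norm for u, x in hub.items()}

    return hub, auth
```

The authority score of each location would then serve as its interest value in the Location-Interest Graph.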


• Trip: A sequence of locations with corresponding typical time intervals

• Itinerary: A recommended trip based on user query Q

• User Query: A user-specified input (start point, end point and duration)

Modeling Itinerary

• Duration as the constraint
  – A trip whose duration exceeds the user's requirement
    • Is of no use to the user
  – Simplifies algorithmic complexity
    • Provides a stopping condition

• First three factors to find candidate trips
  – (1) Elapsed Time Ratio
  – (2) Stay Time Ratio
  – (3) Interest Density Ratio

• Classical travel sequence to differentiate candidates further
  – (4) Classical Travel Sequence Ratio


Architecture

• Offline
  – Analyze collected GPS trajectories
  – Build a Location-Interest Graph (Gr)

• Online
  – Use Gr to recommend an itinerary based on user query

Location-Interest Graph

• Location-Interest Graph
  – (1) Detect stay points
  – (2) Cluster them into locations
  – (3) Calculate location interest
  – (4) Compute classical travel sequence*

• Gr is built offline and contains information on
  – The location itself
    • Interest, typical stay time
  – Relationships between locations
    • Typical traveling time, classical travel sequence

*Zheng, Y., Zhang, L., Xie, X., Ma, W.Y.: Mining Interesting Locations and Travel Sequences from GPS Trajectories. In: WWW 2009, pp. 791-800 (2009)

Query Verification

• In the online process, user query Q needs to be verified by calculating Dist(qs, qd)
  – (1) Using GPS coordinates
    • Haversine formula or the spherical law of cosines
  – (2) Using a Web service such as Bing Maps

• If the query is reasonable
  – Substitute the start point and the end point with the nearest locations in Gr
  – Send an updated query Q` = {ls, ld, qt} to the recommender
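The verification step above can be sketched as follows, using the Haversine formula for Dist(qs, qd). The slide does not say what makes a query "reasonable"; the check used here (the straight-line distance must be coverable within the duration at an assumed average speed) and the `assumed_speed_kmh` parameter are assumptions for illustration, as are the function names.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers (Haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def nearest_location(point, locations):
    """Return the ID of the graph location closest to a (lat, lon) point."""
    return min(locations, key=lambda loc_id: haversine_km(*point, *locations[loc_id]))


def verify_query(qs, qd, qt_hours, locations, assumed_speed_kmh=30.0):
    """Return the updated query (ls, ld, qt), or None if the query is rejected.

    qs, qd: (lat, lon) of the start and end points.
    locations: dict mapping location_id -> (lat, lon).
    Assumption: a query is unreasonable when the start-to-end distance
    cannot be covered within qt_hours at assumed_speed_kmh.
    """
    if haversine_km(*qs, *qd) > assumed_speed_kmh * qt_hours:
        return None
    return nearest_location(qs, locations), nearest_location(qd, locations), qt_hours
```

A Web mapping service would give road distances rather than great-circle distances, which is why the slide lists it as the second option.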

Trip Candidate Selection

• Select trip candidates from the starting location ls to the end location ld.

• Candidate trips do not exceed the given duration qt.
  – (1) Start by adding ls to the trip
  – (2) Add the next feasible location not already in the trip
  – (3) Update the time parameter
  – (4) Repeat until the end location is reached or no more locations can be added
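The four steps above can be sketched as a depth-first enumeration over the graph. This is an illustrative sketch, not the authors' algorithm: the function name and data layout are hypothetical, and it assumes that the typical stay time is counted at every location, including the start and end locations.

```python
def candidate_trips(stay, travel, ls, ld, qt):
    """Enumerate trips from ls to ld whose elapsed time does not exceed qt.

    stay:   dict location_id -> typical stay time
    travel: dict location_id -> {next_location_id: typical time interval}
    Returns a list of (trip, elapsed_time) pairs.
    """
    results = []

    def extend(trip, elapsed):
        current = trip[-1]
        if current == ld:
            # (4) The end location is reached: record the candidate.
            results.append((list(trip), elapsed))
            return
        # (2) Try every feasible next location that is not already in the trip.
        for nxt, dt in travel.get(current, {}).items():
            if nxt in trip:
                continue
            # (3) Update the time parameter; duration is the stopping condition.
            new_elapsed = elapsed + dt + stay[nxt]
            if new_elapsed <= qt:
                trip.append(nxt)
                extend(trip, new_elapsed)
                trip.pop()

    # (1) Start by adding ls to the trip.
    if stay[ls] <= qt:
        extend([ls], stay[ls])
    return results
```

Partial trips that run out of time before reaching ld are simply dropped, which is how the duration constraint prunes the search.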

Trip Candidate Ranking

• Top-k trips, ordered by the Euclidean distance of the vector (Elapsed Time Ratio, Stay Time Ratio, Interest Density Ratio)
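A minimal sketch of this ranking step follows. The slides name the three factors but do not give their formulas, so the definitions below are assumptions: elapsed time ratio = (stay + travel time) / qt, stay time ratio = stay time / qt, and interest density ratio = average interest per location, normalized by the best candidate. The score is taken as the unweighted Euclidean distance of the factor vector from the origin, with larger values ranked first. All function names are hypothetical.

```python
import math


def interest_density(trip, interest):
    """Average interest per location in the trip (assumed definition)."""
    return sum(interest[loc] for loc in trip) / len(trip)


def factor_vector(trip, stay, travel, interest, qt, max_density):
    """Return (elapsed time ratio, stay time ratio, interest density ratio)."""
    total_stay = sum(stay[loc] for loc in trip)
    total_travel = sum(travel[a][b] for a, b in zip(trip, trip[1:]))
    etr = (total_stay + total_travel) / qt
    stay_ratio = total_stay / qt
    idr = interest_density(trip, interest) / max_density if max_density else 0.0
    return etr, stay_ratio, idr


def rank_trips(trips, stay, travel, interest, qt, k=3):
    """Return the top-k (score, trip) pairs, highest score first."""
    if not trips:
        return []
    max_density = max(interest_density(t, interest) for t in trips)
    scored = []
    for trip in trips:
        vector = factor_vector(trip, stay, travel, interest, qt, max_density)
        scored.append((math.sqrt(sum(x * x for x in vector)), trip))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Because all three ratios lie between 0 and 1 under these definitions, a trip scores well only if it is strong on several factors at once, which matches the "balanced" goal stated in the Contributions.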

Re-ranking by Travel Sequence

• Differentiate candidates further with the classical travel sequence, which considers
  – The authority scores of going in and out, and the hub scores

• Re-rank with the Classical Travel Sequence Ratio (CTSR)

Illustrative Example

[Figure: example itinerary annotated with durations 1H, 2H, 1H, 1.5H, 1H, 1H, 30M, 30M, 40M]

Experiments

• Settings
  – GPS trajectories collected from 125 users
    • 17,745 GPS trajectories (May 2007 to Aug. 2009, in Beijing)
  – Time threshold Tr (20 min), distance threshold Dr (200 meters)
  – 35,319 stay points detected, excluding work/home spots
  – Density-based clustering algorithm OPTICS yields 119 locations


• Two evaluation approaches

• (1) Simulated user queries
  – Algorithmic-level comparison
  – Compare quality with baselines

• (2) User study with local residents
  – How users' perceived quality of itineraries compares across methods


• Simulation
  – Four different levels for duration (5, 10, 15, 20 hours)
  – For each level, 1,000 queries are generated

• User Study
  – 10 active residents of Beijing (avg: 3.8 years)
  – Each submitted 3 queries and scored 3 itineraries, generated by our method and two baselines (3x3)

Evaluation (Baselines)

• Ranking-by-Time (RbT)
  – Recommends the itinerary with the highest elapsed time usage

• Ranking-by-Interest (RbI)
  – Ranks the candidates by the total interest of the locations included in the itinerary
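The two baselines are simple enough to sketch directly. The function names and data layout below are hypothetical, and elapsed time is assumed to be the sum of typical stay times and typical time intervals along the trip.

```python
def elapsed_time(trip, stay, travel):
    """Total stay time plus travel time along the trip (assumed definition)."""
    total_stay = sum(stay[loc] for loc in trip)
    total_travel = sum(travel[a][b] for a, b in zip(trip, trip[1:]))
    return total_stay + total_travel


def rank_by_time(trips, stay, travel):
    """RbT: order candidates by elapsed time usage, highest first."""
    return sorted(trips, key=lambda t: elapsed_time(t, stay, travel), reverse=True)


def rank_by_interest(trips, interest):
    """RbI: order candidates by total interest of their locations, highest first."""
    return sorted(trips, key=lambda t: sum(interest[loc] for loc in t), reverse=True)
```

Each baseline optimizes a single factor, so the two can disagree on the same candidate set: one may prefer a long trip through moderately interesting places while the other prefers a shorter trip through a highly rated one.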

Results

• At the 5-hour level
  – All three methods produce results of similar quality
  – There are few candidates, and they overlap anyway

Results

• At the 10 to 20-hour levels
  – Baseline algorithms perform well in only one aspect
  – Our algorithm produces well-balanced itineraries and also considers the classical travel sequence





Results

• How does our method compare to RbT in terms of perceived time use?

• How does our method compare to RbI in terms of perceived interest?

• RbT shows no significant advantage in perceived time use, nor RbI in perceived interest

• Our method is well balanced and competitive

Conclusion

• Based on user-generated GPS trajectories
  – Build Location-Interest Graph
  – Model and define a good itinerary

• Recommend an itinerary based on user query
  – Find candidates and rank them considering three factors (elapsed time, stay time and interest density)
  – Re-rank with classical travel sequence

• Evaluated with real and simulated user queries

• Future Work
  – Personalized recommendation using user preference


Discussions and More Information

• GIST U-VR Lab, Gwangju 500-712, Korea
• E-Mail: [email protected]
• Web: http://wiki.uvr.gist.ac.kr/Main/HyoseokYoon
