GPS Data Collection
Harry Timmermans
Eindhoven University of Technology
04/12/2023
The Survey Method
• Conventional survey methods for activity-travel diary data
• Application of new data collection methods
  – GPS logger (original traces)
  – User participation
    • Social demographic information
    • Personal profile
    • Downloading and uploading data
    • Validating activity-trip agendas
  – Web-based prompt recall survey
    • Embedded in TraceAnnotator
The Prompt Recall
Validation of Activities/Trips
Survey Management
• Time horizon
  – 4 waves, each wave takes 3 months
  – Each individual is invited for 3 months continuously
• Location
  – Rijnmond and Eindhoven regions
• Respondents
  – People living in the area
  – Companies recruit their own panels
• The statistics that follow use data from the Rijnmond region as an example
User Participation (# of days)
[Figure: User participation, Rijnmond area. x-axis: number of days (0~7, 8~14, 15~31, 32~60, 60~); y-axis: percentage of participation (0% to 70%). Bar labels in order: 19%, 6%, 11%, 5%, 59%.]
• 300 of 434 respondents are fully or partly involved in the survey
Age of Respondents
[Figure: Age of respondents. x-axis: age (~16yr, 17~30yr, 31~55yr, 56~65yr, 66~yr); y-axis: percentage (0% to 45%). Bar labels in order: 0%, 10%, 41%, 26%, 23%.]
The percentage of respondents who are older than 55 is almost 50%.
No children
Frequency of Activities/Trips
Missing days
The high frequency is due to short events, which need to be filtered out.
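A minimal sketch of such a filter is given below. The `Episode` record and the 3-minute cutoff `MIN_DURATION` are assumptions made for illustration; the slides do not state which threshold or data structure was actually used.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Episode:
    kind: str        # "activity" or "trip"
    start: datetime
    end: datetime


# Hypothetical cutoff; the slides do not give the threshold actually used.
MIN_DURATION = timedelta(minutes=3)


def drop_short_episodes(episodes, min_duration=MIN_DURATION):
    """Keep only the episodes that last at least min_duration."""
    return [e for e in episodes if e.end - e.start >= min_duration]


episodes = [
    Episode("activity", datetime(2013, 1, 7, 8, 0), datetime(2013, 1, 7, 8, 1)),
    Episode("trip", datetime(2013, 1, 7, 8, 1), datetime(2013, 1, 7, 8, 25)),
    Episode("activity", datetime(2013, 1, 7, 8, 25), datetime(2013, 1, 7, 17, 0)),
]
kept = drop_short_episodes(episodes)
```

In practice a filter of this kind would probably also merge a dropped episode into its neighbours so that the day stays continuous; that step is left out here.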
Single activity
Frequency of Activity Type
Ave. Activity Duration by Type
Frequency of Transport Mode
Many short walking trips
Approach
• Classification of transport modes and activity episodes
  – Bayesian Belief Network (BBN)
    • Replaces ad hoc rules
• A graphical representation of probabilistic causal information that incorporates sets of conditional probability tables
• Represents the interrelationship between spatial and temporal factors (input) and the activity-travel pattern (output), i.e. transportation modes and activity episodes
• Learning-based: accuracy improves as consistent evidence is obtained over time from more samples
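To illustrate how a conditional probability table turns evidence into a classification, the sketch below uses the smallest possible network: one mode node and one observed speed-class node. The three modes, the speed classes and every probability are invented for the example and do not come from the study; the actual BBN uses more input variables.

```python
# Illustrative numbers only; none of these probabilities come from the study.
PRIOR = {"walking": 0.3, "cycling": 0.2, "car": 0.5}

# Conditional probability table: P(speed class | mode).
CPT_SPEED = {
    "walking": {"slow": 0.90, "medium": 0.09, "fast": 0.01},
    "cycling": {"slow": 0.20, "medium": 0.75, "fast": 0.05},
    "car":     {"slow": 0.10, "medium": 0.30, "fast": 0.60},
}


def posterior(speed_class):
    """Return P(mode | speed class) using Bayes' rule."""
    joint = {m: PRIOR[m] * CPT_SPEED[m][speed_class] for m in PRIOR}
    total = sum(joint.values())
    return {m: p / total for m, p in joint.items()}


post = posterior("fast")
best = max(post, key=post.get)   # mode with the highest posterior probability
```

Learning from more samples corresponds to re-estimating `PRIOR` and `CPT_SPEED` from validated data, which is how consistent evidence can improve accuracy over time.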
Framework
04/12/2023 Feng&Timmermans 13
• Transportation mode
• Activity episode
Personal Data
GPS data
Geographical Data
Conditional Probabilities
Theoretical support and applications
• Accuracy of the algorithm
  – Limited sample and transportation modes
  – Full sample and full transportation modes
• Comparison of different imputation algorithms
• Improve the imputed activity/trip sequence
• Map matching between GPS traces and road networks
• Impact of equity of travel time uncertainty
Accuracy of the Algorithm
Source: Anastasia, et al., (2010) Semi-Automatic Imputation of Activity-Travel Diaries Using GPS Traces, Prompted Recall and Context-Sensitive Learning Algorithms. Journal of Transportation Research Record, 2183.
Accuracy of the Algorithm
            Activity  Walking  Running  Cycling  Bus   Motorcycle  Car   Train  Metro  Tram  Light rail
Activity    84%       4%       0%       0%       0%    0%          1%    9%     2%     0%    0%
Walking     2%        97%      0%       0%       1%    0%          0%    0%     0%     0%    0%
Running     0%        0%       98%      0%       1%    0%          1%    0%     0%     0%    0%
Cycling     0%        0%       0%       100%     0%    0%          0%    0%     0%     0%    0%
Bus         1%        0%       0%       0%       87%   0%          0%    0%     0%     12%   0%
Motorcycle  0%        0%       0%       0%       0%    100%        0%    0%     0%     0%    0%
Car         0%        0%       0%       0%       1%    0%          98%   0%     0%     0%    1%
Train       0%        0%       0%       0%       0%    0%          5%    58%    36%    0%    0%
Metro       1%        0%       0%       0%       0%    0%          0%    1%     98%    0%    0%
Tram        0%        0%       0%       0%       0%    0%          2%    0%     0%     98%   0%
Light rail  0%        0%       0%       0%       2%    0%          0%    0%     0%     0%    98%
GPS only: Activity 84%, Walking 97%, Running 98%, Cycling 100%, Bus 87%, Motorcycle 100%, Car 98%, Train 58%, Metro 98%, Tram 98%, Light rail 98%
Source: Feng, T and Timmermans, H. (2012) Recognition of transportation mode using GPS and accelerometer data. International Conference of IATBR, Toronto, Canada, 15-20, July, 2012.
Comparison of Imputation Algorithms
Id  Algorithm
1   Bayesian Network (BN)
2   Naive Bayesian (NB)
3   Logistic regression (LR)
4   Multilayer Perceptron (MP)
5   Decision Table (DT)
6   Support Vector Machine (SVM)
7   C4.5 (C45)
8   CART (CART)
Algorithm  Training data              Test data
           CCI (%)  ICI (%)  Kappa    CCI (%)  ICI (%)  Kappa
BN         99.805   0.195    0.997    99.474   0.526    0.993
NB         86.966   13.034   0.822    86.648   13.352   0.818
LR         94.865   5.135    0.926    94.510   5.490    0.921
MP         97.118   2.882    0.958    96.816   3.184    0.954
DT         98.886   1.114    0.984    98.100   1.900    0.973
SVM        94.667   5.333    0.923    94.458   5.542    0.920
C45        99.825   0.175    0.998    99.309   0.691    0.990
Table 3 Prediction accuracy and model performance
• Training data and test data
• The indicators used are correctly classified instances (CCI), incorrectly classified instances (ICI) and the Kappa value (Kappa).
• Data are for each time epoch
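The three indicators can be computed from per-epoch labels as sketched below. The sketch assumes that Kappa refers to Cohen's kappa (observed agreement corrected for chance agreement); the function name and the toy labels are illustrative and not taken from the study.

```python
from collections import Counter


def classification_summary(actual, predicted):
    """Return (CCI %, ICI %, kappa) for two equally long label sequences."""
    n = len(actual)
    correct = sum(a == p for a, p in zip(actual, predicted))
    cci = 100.0 * correct / n
    ici = 100.0 - cci

    # Cohen's kappa: (p_o - p_e) / (1 - p_e), where p_e is the agreement
    # expected by chance given the marginal class frequencies.
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    p_o = correct / n
    p_e = sum(actual_counts[c] * predicted_counts[c] for c in actual_counts) / (n * n)
    kappa = 1.0 if p_e == 1.0 else (p_o - p_e) / (1.0 - p_e)
    return cci, ici, kappa


# Toy per-epoch labels, invented for the example.
actual = ["car", "car", "walk", "walk", "bike", "bike", "car", "walk"]
predicted = ["car", "car", "walk", "bike", "bike", "bike", "car", "car"]
cci, ici, kappa = classification_summary(actual, predicted)
```

CCI and ICI always sum to 100%, which matches the paired columns in Table 3.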
- WCTRS 2013
               Count   Percentage
Training data  39,942  75%
Test data      13,316  25%
Total          53,258  100%
Training and test datasets
Comparison of Imputation Algorithms
Table 4 Hit ratios by transportation mode and activity episode
Note: A-Activity episode; B-Train; C-Walking; D-Bike; E-Car; F-Bus; G-Motorbike; H-Running; I-Tram; J-Metro
• BN and C45 appear to perform more stably than the other algorithms
• The hit ratios for the test data are not necessarily lower than those for the training data, except for BN and C45.
• The level of the hit ratios of the BN model is comparable with that of the other methods.
Training data
      A      B      C      D      E      F      G      H      I      J
BN    0.997  0.997  0.999  1      0.999  0.999  1      0.999  1      1
NB    0.848  0.969  0.934  0.799  0.836  0.926  0.949  0.98   1      0.983
LR    0.989  0.991  0.818  0.928  0.891  0.758  0.947  0.76   1      1
MP    0.998  0.974  0.916  0.926  0.965  0.743  0.989  0.985  1      1
DT    0.999  0.971  0.958  0.985  0.979  0.99   0.991  0.974  0.982  0.98
SVM   0.987  0.999  0.76   0.925  0.876  0.888  0.971  0.654  1      1
C45   1      0.999  0.993  0.997  0.997  0.994  0.998  0.999  0.996  0.99
Test data
      A      B      C      D      E      F      G      H      I      J
BN    0.996  0.993  0.988  0.997  0.994  0.977  0.999  1      1      0.983
NB    0.849  0.964  0.942  0.789  0.826  0.9    0.946  0.963  1      0.975
LR    0.99   0.994  0.815  0.915  0.882  0.733  0.935  0.752  1      1
MP    0.998  0.976  0.896  0.926  0.962  0.708  0.987  0.974  1      1
DT    0.998  0.948  0.939  0.973  0.97   0.973  0.982  0.963  0.892  0.959
SVM   0.987  0.998  0.763  0.931  0.869  0.844  0.968  0.641  0.985  1
C45   0.998  0.998  0.974  0.992  0.987  0.98   0.991  0.956  1      0.992
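The slides do not define the hit ratio. The sketch below assumes it is the share of epochs of a given class that the model assigns to that same class (per-class recall); under a different definition the computation would differ. The labels are invented and reuse the letter codes from the note to Table 4.

```python
from collections import defaultdict


def hit_ratios(actual, predicted):
    """Per-class share of instances whose predicted label equals the actual one."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for a, p in zip(actual, predicted):
        totals[a] += 1
        if a == p:
            hits[a] += 1
    return {c: hits[c] / totals[c] for c in totals}


# Toy labels (A = activity episode, C = walking, E = car, F = bus).
actual = ["A", "A", "A", "A", "C", "C", "E", "E", "E", "E"]
predicted = ["A", "A", "A", "C", "C", "C", "E", "E", "F", "E"]
ratios = hit_ratios(actual, predicted)
```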
Superimposing the activity/trip sequence
[Figure: example activity/trip sequence. Locations: L1 = L4 (Home), L2 (Work), L3 (Shop), L5 (Restaurant), Sport; Trips 1 to 6; numbered markers 1 to 3.]
• Method 1
  o For each trip episode separately, the frequency of the transportation mode with the highest probability is identified. The transportation mode with the highest frequency over all trips is then selected.
• Method 2
  o The frequencies of all transportation modes of all trip episodes that belong to the same tour are pooled. The mode with the highest frequency among the highest-probability modes is then selected to replace the others.
• Method 3
  o In the case of three or more trips within the same tour, the transportation mode is identified using Method 1 for all trips excluding the first and the last. The confirmed mode then replaces the mode of the first and last trips.
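One possible reading of the three methods is sketched below. It assumes that a tour is a list of trips and that each trip is a list of the most probable mode per time epoch; the function names, the tie-breaking rule and the fallback for short tours in Method 3 are assumptions, since the slides do not specify them.

```python
from collections import Counter


def most_frequent(labels):
    """Most common label; ties go to the label seen first (an assumption)."""
    return Counter(labels).most_common(1)[0][0]


def method_1(tour):
    # Dominant mode per trip, then the mode that dominates the most trips.
    trip_modes = [most_frequent(trip) for trip in tour]
    return most_frequent(trip_modes)


def method_2(tour):
    # Pool the epochs of all trips in the tour and take the dominant mode.
    pooled = [mode for trip in tour for mode in trip]
    return most_frequent(pooled)


def method_3(tour):
    # With three or more trips, decide on the middle trips only.
    # Falling back to Method 1 for shorter tours is an assumption.
    if len(tour) < 3:
        return method_1(tour)
    return method_1(tour[1:-1])


def replace_modes(tour, mode):
    """Overwrite every epoch of every trip in the tour with the selected mode."""
    return [[mode] * len(trip) for trip in tour]


# Invented tour: each inner list holds the per-epoch most probable mode.
tour = [
    ["walking", "car", "car"],
    ["car", "car", "car", "bus"],
    ["car", "bus", "car"],
    ["walking"] * 7,
]
selected = {"m1": method_1(tour), "m2": method_2(tour), "m3": method_3(tour)}
```

In this toy tour the long walking trip at the end outweighs the car epochs when everything is pooled, so Method 2 selects walking while Methods 1 and 3 select car.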
- NTTS2013
                  Morning peak  Evening peak
Original imputed  60.5%         71.1%
Method 1          65.8%         76.3%
Method 2          76.3%         65.4%
Method 3          63.2%         68.4%
• Hit ratios of car mode (AM vs. PM)
           BIKE    BUS     CAR     METRO   TRAIN   TRAM    WALKING
Original
  BIKE     4.3%    -       6.4%    4.8%    -       5.6%    20.9%
  BUS      4.3%    -       34.6%   9.5%    -       -       21.3%
  CAR      4.3%    42.9%   2.3%    6.3%    57.1%   -       24.4%
  METRO    -       -       0.5%    27.0%   -       -       2.2%
  RUNNING  48.9%   -       0.3%    -       -       -       12.5%
  TRAIN    -       4.8%    42.7%   34.9%   28.6%   -       17.2%
  TRAM     -       47.6%   1.8%    -       -       79.6%   0.9%
  WALKING  38.3%   4.8%    11.5%   17.5%   14.3%   14.8%   0.6%
Method 1
  BIKE     34.0%   -       2.8%    4.8%    -       1.9%    14.1%
  BUS      4.3%    4.8%    22.6%   9.5%    -       -       9.7%
  CAR      -       28.6%   26.2%   11.1%   85.7%   -       44.4%
  METRO    -       4.8%    0.8%    23.8%   -       -       1.9%
  RUNNING  34.0%   -       0.3%    -       -       -       5.3%
  TRAIN    -       9.5%    28.8%   33.3%   -       -       14.4%
  TRAM     -       38.1%   1.3%    -       -       72.2%   3.4%
  WALKING  27.7%   14.3%   17.3%   17.5%   14.3%   25.9%   6.9%
Method 2
  BIKE     19.1%   -       3.1%    4.8%    -       1.9%    15.0%
  BUS      4.3%    -       19.6%   9.5%    -       -       7.8%
  CAR      2.1%    33.3%   26.7%   11.1%   71.4%   -       44.1%
  METRO    -       -       0.8%    20.6%   -       -       1.6%
  RUNNING  34.0%   -       0.3%    -       -       -       6.3%
  TRAIN    -       9.5%    31.6%   36.5%   14.3%   -       14.4%
  TRAM     -       47.6%   2.0%    -       -       77.8%   2.5%
  WALKING  40.4%   9.5%    16.0%   17.5%   14.3%   20.4%   8.4%
Method 3
  BIKE     17.0%   -       4.8%    4.8%    -       1.9%    13.8%
  BUS      4.3%    -       23.2%   9.5%    -       -       14.4%
  CAR      2.1%    38.1%   13.7%   6.3%    57.1%   -       29.7%
  METRO    -       -       1.3%    27.0%   -       -       1.6%
  RUNNING  29.8%   -       0.3%    -       -       5.6%    10.6%
  TRAIN    -       9.5%    34.4%   36.5%   28.6%   -       16.3%
  TRAM     -       38.1%   0.8%    -       -       68.5%   2.5%
  WALKING  46.8%   14.3%   21.6%   15.9%   14.3%   24.1%   11.3%
Total      100.0%  100.0%  100.0%  100.0%  100.0%  100.0%  100.0%
• Confusion matrix of original imputed data and new methods
• The confusion matrix shows that the suggested algorithm could substantially improve the accuracy of the imputation;
• As shown in the hit ratio, all improved methods lead to increased accuracy for morning peak trips relative to originally imputed data;
• Method 1 is better than the other two methods, especially for the prediction of motorized commute trips during peak times.
Feedback from Respondents
• Problems during the survey
  – Problems using BT747
    • Different Windows systems (64-bit)
    • Internet browser (Firefox sometimes has problems)
    • Can't download data (complex reasons)
    • Can't upload data (wrong data file or data format)
  – Problems with the website
    • Small bugs in the website program (improved)
    • Multiple persons in the same household (user account specific)
    • Long processing time (Not cleaning data)
  – Missing days
    • Forgotten GPS logger or problematic data (view as a schedule)
Other Issues
• Sufficient number of respondents
• Monitor and remind respondents
• Completeness of personal profile data (social demography)
• Post data processing
Thanks for your attention.