Comparison of Model Based and Machine Learning Approaches ... · 1 Comparison of Model Based and Machine Learning Approaches for 2 Bus Arrival Time Prediction 3 4 Vivek Kumar 5 Under

Comparison of Model Based and Machine Learning Approaches for 1

Bus Arrival Time Prediction 2 3 Vivek Kumar 4 Under Graduate student 5 Department of Civil Engineering, 6 Indian Institute of Technology Kharagpur 7 Kharagpur 721 302 8 INDIA 9 E-mail: [email protected] 10

B. Anil Kumar 11 Graduate student 12 Department of Civil Engineering, 13 Indian Institute of Technology Madras 14 Chennai 600 306 15 INDIA 16 E-mail: [email protected] 17

Lelitha Vanajakshi1

E-mail:

18 Associate Professor 19 Department of Civil Engineering 20 Indian Institute of Technology Madras 21 Chennai 600 036 22 INDIA 23 Ph: 91 44 2257 4291, Fax: 91 44 2257 4252 24

[email protected] 25

and 26 27 Shankar C. Subramanian 28 Associate Professor 29 Department of Engineering Design 30 Indian Institute of Technology Madras 31 Chennai 600 036 32 INDIA 33 Ph: 91 44 2257 4705 34 E-mail: [email protected] 35 36 Paper Submitted for presentation and publication in 37 Transportation Research Record, Transportation Research Board, 38 National Research Council, Washington, D. C. 39 40 Word Count: Body text (3946) + Figures (1500) + Tables (750) = 6196. 41 Submitted on: August 1, 2013 42

1 Corresponding author

TRB 2014 Annual Meeting Paper revised from original submittal.

Vivek, Anil, Lelitha and Shankar 1


Bus Arrival Time Prediction 2

ABSTRACT 3 The provision of accurate bus arrival information is critical to encourage more people to use 4 public transport and alleviate traffic congestion. Developing a prediction scheme for bus travel 5 times can provide such information. Prediction schemes can be data driven or may use a 6 mathematical model that is usually less data intensive. This paper compares the performance of 7 two methods – one being the data driven Artificial Neural Network (ANN) method and the other 8 being the model based Kalman filter method, with regards to predicting bus travel time. The 9 performances of both methods are evaluated using data collected from the field. It was found that 10 the ANN based method performed slightly better in the presence of a large database but the 11 Kalman filter method will be more advantageous when such a database is not available. 12




Bus Arrival Time Prediction 2

INTRODUCTION 3 In recent times, traffic congestion has been increasing in Indian cities due to rapid changes in 4 urbanization, economy levels, vehicle ownership and population growth that has lead to several 5 negative impacts such as delays, pollution, etc. To overcome these problems, there is a need to 6 provide more facilities by infrastructure expansion such as Bus Rapid Transit Systems (BRTS) 7 (Ahmedabad), and Metro Rail Systems (Delhi, Hyderabad, Chennai, Bangalore). However, 8 infrastructure expansion alone cannot meet the vehicular growth, and hence, there is a need to 9 explore better traffic operations and management systems. One of the major challenges of traffic 10 management in most of the developing countries such as India is due to heterogeneous traffic, 11 which comprises both motorized and non-motorized vehicles with diverse vehicular 12 characteristics. The motorized or fast moving vehicles include passenger cars, buses, trucks, auto 13 rickshaws and motor cycles, whereas the non-motorized or slow moving vehicles include 14 bicycles, cycle rickshaws and animal drawn carts. The use of Intelligent Transportation Systems 15 (ITS) for operation and management of traffic is a better option that is gaining interest in recent 16 years. Attracting more travelers towards public transportation system is one way to reduce 17 congestion, which comes under Advanced Public Transportation Systems (APTS). APTS is one 18 of the functional areas of ITS that can help to attract more people towards public transport. 19 Predicting accurate bus travel times and providing reliable information to passengers is a popular 20 APTS application. However, the information provided to passengers should be reliable; 21 otherwise customers may reject the system due to lack of reliability(1). The reliability of such 22 information being provided depends on the prediction technique and the input data used for the 23 same. 24

There are many studies on prediction of travel time using various techniques such as 25 historical and real-time approaches, statistical techniques, machine learning techniques and 26 model-based techniques. However, there are only limited studies under heterogeneous traffic 27 conditions such as the one existing in India. Machine learning techniques such as Artificial 28 Neural Networks (ANN) and Support Vector Machines (SVM) are commonly used to predict 29 travel time because of their ability to solve complex non-linear relationships. ANN has proved to 30 be one of the most effective tools for pattern recognition across different sets of problems. 31 Considering anomalies in datasets, which are true for travel time across a particular stretch of 32 road, ANN seems to be a suitable candidate for prediction. This study attempts the short term 33 Bus Travel Time Prediction (BTTP) by developing a neural network model taking most 34 correlated previous trips of same day and same week in to account. In this study, a particular 35 section of the road is divided into different subsections and the network is trained separately for 36 each subsection. The disadvantage is that these types of techniques need a large amount of data 37 to train the system. 38

Model-based techniques, on the other hand, will use models that capture the dynamics of 39 the system by establishing mathematical relationship between variables. In this study, equations 40 that can characterize the evolution of travel time over space are used. Many model-based studies 41 use the Kalman Filtering Technique (KFT) for estimation. Advantages of this approach are it's 42 limited data requirement and suitability for real time implementation. In this study, a model-43 based approach that uses data from just two previous buses will be compared with the ANN 44 technique. 45



Thus, the objective of the present study is to predict the travel time of buses under Indian 1 traffic conditions by using ANN and a model-based approach using KFT by providing the 2 appropriate inputs to the models. The study compares the performance of a data driven approach 3 with a model based method that requires only minimal data for bus travel time/arrival time 4 prediction. 5

LITERATURE REVIEW 6 Since the present study focuses on the use of ANN and model based approaches for the bus 7 arrival prediction, some of the previous works in this area are reviewed. Jeong et al. (2) reported 8 a bus travel time prediction model based on ANN taking into account arrival time, dwell time 9 and schedule adherence. Wang et al.(3) used support vector regression taking departure time 10 from the stops as inputs to reflect the traffic conditions. The developed model also used historical 11 bus travel time data, parameters of traffic conditions along bus route, and route specific 12 parameters to predict future bus travel time. Liu et al.(4) developed a state space neural network 13 model for bus arrival time prediction. Chien et al. (5) used two ANN models, link based and stop 14 based, for bus arrival prediction. Park et al. (6) predicted link travel time by spectral basis 15 artificial neural network (SNN) for one through five time period ahead for same vehicle. 16

Dailey (7) used a combination of Automatic Vehicle Location (AVL) and historic 17 database to predict travel time by using KFT and statistical analysis. Cathey (8)used bus travel 18 time data as inputs to predict the same by using KFT that involved three components namely 19 tracker, filter and predictor. Shalaby (9) used a combination of AVL and Automatic Passenger 20 Counter (APC) data to predict travel time by using KFT. Nanthawichit et al. (10) used a 21 combination of Global Positioning System (GPS) equipped probe vehicles and loop detectors 22 data to estimate travel time by using KFT. The performance of the proposed methods were 23 compared with historical data based, regression and ANN models separately. 24

All the studies discussed above dealt with homogeneous traffic conditions. Only a limited 25 number of studies were reported under heterogeneous traffic conditions. Rama Krishna et al.(11) 26 used 25 trips of bus travel time to develop ANN and Multiple Linear Regression (MLR) models. 27 Vanajakshi et al. (12) used preceding two bus trips data collected by using GPS to predict next 28 bus travel time by using KFT. Padmanabhan et al. (13)extended the above study by 29 incorporating the delays in the model. Kumar(14) proposed a statistical methodology to find out 30 patterns in the data and used them as input to predict the next bus travel time by using KFT. The 31 present study will follow the above study to identify the travel time data most suited as inputs 32 and use them to develop an ANN model to predict the bus arrival time. The performance of such 33 a data driven technique will be compared with a model based approach with lower data 34 requirement (using previous two buses data). 35

DATA COLLECTION, EXTRACTION AND ANALYSIS 36 For the purpose of collecting data, GPS equipped Metropolitan Transport Corporation (MTC) 37 buses in the city of Chennai, India, were used. The test bed chosen for the present study is an 38 MTC bus route, 5C, which connects the Parry’s bus depot in the northern part of the city to the 39 Taramani bus depot in the southern part of the city. It has a route length of 15km with varying 40 land use. Figure 1 illustrates the study route with bus stop details and distances between the bus 41 stops are tabulated in Table 1. 42



Table 1 Distance between Bus Stops 1

S.No Bus stop name Distance between bus stops (km)

Cumulative distance from the initial bus stop (km)

1 Parry’s Corner 0.00 0.00 2 Central Railway Station 0.93 0.93 3 P. Orr & Sons 1.68 2.61 4 Wesley High School 2.70 5.31 5 C.I.T. Colony 2.34 7.65 6 Adyar Gate 2.04 9.69 7 Kotturpuram 1.96 11.65 8 Women’s Polytechnic College 2.21 13.86 9 Taramani 1.43 15.29

2

3

FIGURE 1 ROUTE MAP ALONG WITH BUS STOP DETAILS. 4

The collected GPS data include the ID of the GPS unit, time stamp, and latitude and 5 longitude of the location at which the entry was made. Real time communication of this data was 6 made possible through General Packet Radio Service (GPRS). The collected data is stored using 7 Sequential Query Language (SQL) database encompassing all trips in a day. The average 8 headway between two consecutive vehicles in this route is around 45 minutes. The data were 9 collected for every 5 seconds from 6 AM to 8 PM for 45 days in the months of January and 10 February 2013. The distance between two consecutive entries was calculated by using the 11 Haversine formulae (15), which gives the great circle distances between two points on a sphere 12 from their latitudes and longitudes as 13



𝐷𝐷𝐷𝐷𝐷𝐷𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡𝑡(𝑑𝑑) = 2𝑟𝑟 𝑡𝑡𝑟𝑟𝑡𝑡𝐷𝐷𝐷𝐷𝑡𝑡(�ℎ𝑡𝑡𝑎𝑎𝑡𝑡𝑟𝑟𝐷𝐷𝐷𝐷𝑡𝑡𝑡𝑡(∅2 − ∅1 + 𝑡𝑡𝑐𝑐𝐷𝐷∅1𝑡𝑡𝑐𝑐𝐷𝐷∅2ℎ𝑡𝑡𝑎𝑎𝑡𝑡𝑟𝑟𝐷𝐷𝐷𝐷𝑡𝑡(𝜆𝜆2 − 𝜆𝜆1) , (1) 1

where r is the radius of the earth (6378.1 km), ∅1,∅2 indicate the latitude of point 1 and point 2, 2 𝜆𝜆1, 𝜆𝜆2 indicate the longitude of point1 and point 2. Thus, the processed data consists of the travel 3 times and the corresponding distance between consecutive locations for all the buses. The entire 4 section was divided into 150 subsections each of 100m length and the corresponding time taken 5 to cover each subsection was calculated by using the linear interpolation technique. The outliers 6 in the data were removed by keeping the lower bound permissible value as 5th percentile travel 7 time while the higher bound permissible time was taken as 95th percentile travel time for each 8 subsection. 9

METHODOLOGY 10

Artificial Neural Network (ANN) 11 A neural network is a massively parallel distributed processor made up of simple processing 12 units that have a natural propensity for storing experimental knowledge and making it available 13 for later use (16). This model basically tries to replicate how our brain works, its learning process 14 and response to new sets of events. There are three layers in an ANN namely the input layer, the 15 hidden layer and the output layer as shown in Figure 2. A basic unit of the connection is called a 16 neuron, which is connected by other neurons through synaptic weights. Based on the desired 17 target value, the network is trained, i.e., the weight matrix is updated after every iteration of the 18 algorithm. Thus, as the number of iterations increases, the predicted output value matrix shifts 19 closer to that of the target value. 20

21

FIGURE 2: BASIC DIAGRAM OF NEURAL NETWORK OF THE MODEL USED IN THE PRESENT 22 STUDY 23

Since the problem under study is a supervised learning problem, a multi-layer feed-24 forward network with Levenberg-Marquardt back propagation algorithm is used for training. 25 This algorithm is considered to be one of the fastest method for training moderate-sized feed-26 forward neural networks which may range up to several hundred weights (16).It also has an 27



efficient implementation in MATLAB software. A hyperbolic tangent sigmoid function was used 1 as the transfer function for both the hidden layer and the output layer. 2

The ANN model requires a good dataset of the desired output with corresponding inputs, 3 making up the training set. After training, the model is simulated with a new input data setto 4 check its efficacy. The predicted value obtained after simulation and the actual target value are 5 compared and error percentage is found out. The total error is quantified usingMean Absolute 6 Percentage Error (MAPE), which is 7

𝑀𝑀𝑀𝑀𝑀𝑀𝑀𝑀 =∑

𝑋𝑋𝑝𝑝−𝑋𝑋𝑡𝑡𝑋𝑋𝑡𝑡

∗100𝑁𝑁𝐷𝐷=1

𝑁𝑁, (2) 8

where𝑥𝑥𝑡𝑡 is the actual value and 𝑥𝑥𝑝𝑝 is the predicted value. It is decided to use the six most 9 correlated trips’ travel time on that particular stretch for each day as the inputsfor the neural 10 network. Thus, the numbers of nodes in the input and output layers are6 and 1 respectively. The 11 median analysis of trip acceptance ratio is used to select the number of inputs for the neural 12 network model. The acceptance ratios of day wise pattern and trip-wise pattern are arranged in 13 descending order in Table 2a. Now, the median of correlated value for all the seven days are 14 analyzed which gives a value of 75% and above for top 6 correlated trips which is acceptable. 15 One hidden layer with 7 neurons is identified as the best option by trial and error and checking 16 error plots. The training rule for the ANN used in the study is 17

𝑥𝑥𝑘𝑘+1 = 𝑥𝑥𝑘𝑘 − [𝐽𝐽𝑇𝑇𝐽𝐽 + 𝜇𝜇𝜇𝜇]−1𝐽𝐽𝑇𝑇𝑡𝑡 , (3) 18

where J is the Jacobian matrix that contains first derivatives of the network errors with respect to 19 the weights and biases, and e is a vector of network errors. The Jacobian matrix can be computed 20 through a standard back propagation technique (16).To test the performance of the above 21 method, seven days’ data from 29th January 2013 to 4th February 2013 were taken. On the other 22 hand, to identify the relevant inputs, a pattern analysis was carried out using two weeks data 23 from29th January 2013 to 12th February 2013. Since, the headway between two consecutive 24 buses is approximately 45 minutes; one trip per every one hour was used from 6 AM to 8 PM. 25 The Z-test for the mean of population of differences for paired samples was conducted for the 26 hypothesis testing at 5% level of significance (14). For analyzing the daily pattern, each trip was 27 compared with the previous seven days’ corresponding trips with the same starting time. For 28 analyzing trip-wise pattern, each output trip was compared with the previous five trips that 29 happened within the same day. A ratio has been calculated between the number of times the 30 claimed null hypothesis is accepted to the total number of times the hypothesis tested for each 31 case. If the acceptance ratio is high, we can conclude that input is significant in predicting the 32 output trip (current trip). The pattern analysis results obtained for all days of the week are shown 33 in Table 2A and the corresponding trips are shown in Table 2B. 34

TABLE 2A: Acceptance ratios as per ranking 35

Sunday Monday Tuesday Wednesday Thursday Friday Saturday Median 0.806 0.839 0.821 0.867 0.857 0.875 0.732 0.839 0.806 0.806 0.821 0.822 0.844 0.857 0.722 0.821 0.768 0.806 0.806 0.800 0.843 0.804 0.714 0.804 0.750 0.806 0.806 0.786 0.800 0.786 0.696 0.786



0.667 0.804 0.786 0.757 0.771 0.778 0.696 0.771 0.643 0.804 0.768 0.756 0.700 0.750 0.694 0.750 0.639 0.786 0.722 0.743 0.689 0.722 0.694 0.722 0.429 0.778 0.696 0.733 0.667 0.661 0.694 0.694 0.375 0.778 0.694 0.700 0.644 0.639 0.661 0.661 0.357 0.732 0.667 0.671 0.629 0.639 0.639 0.639 0.357 0.661 0.643 0.556 0.622 0.583 0.625 0.622 0.232 0.286 0.482 0.386 0.386 0.357 0.589 0.386

1 Table 2B: Pattern Analysis Results for ANN model 2

Rank Sunday Monday Tuesday Wednesday Thursday Friday Saturday 1 t-1 d-7 d-7 t-1 d-7 d-7 d-7 2 t-2 t-1 d-1 t-2 t-1 d-1 t-2 3 d-7 t-2 t-1 d-7 d-1 d-2 d-1 4 t-4 t-3 t-4 d-2 d-6 d-4 d-2 5 t-3 d-2 d-5 d-1 d-2 t-1 d-4 6 d-1 d-5 d-4 t-3 d-3 d-3 t-1 7 t-5 d-4 t-5 d-6 t-2 t-2 t-3 8 d-3 t-4 d-6 t-4 t-3 d-6 t-4 9 d-5 t-5 t-3 d-4 t-5 t-3 d-3 10 d-2 d-3 t-2 d-5 d-5 t-4 t-5 11 d-4 d-6 d-3 t-5 t-4 t-5 d-6 12 d-6 d-1 d-2 d-3 d-4 d-5 d-5

* (d-n) represents previous nth day same time tripand (t-n) represents previous nth trip within the same day. 3

From this, the six most relevant trips were used as inputs for each day (75% or more 4 acceptance ratio). Data were normalized before using based on the formula 5

𝑥𝑥𝑘𝑘(𝐷𝐷)−𝑚𝑚𝐷𝐷𝑡𝑡 𝑘𝑘(𝐷𝐷) 𝑚𝑚𝑡𝑡𝑥𝑥 𝑘𝑘(𝐷𝐷)−𝑚𝑚𝐷𝐷𝑡𝑡 𝑘𝑘(𝐷𝐷)

, (4) 6

where𝑥𝑥𝑘𝑘(𝐷𝐷) is the ith element of column ‘x’ in the dataset, and 𝑚𝑚𝑡𝑡𝑥𝑥𝑘𝑘(𝐷𝐷) and 𝑚𝑚𝐷𝐷𝑡𝑡𝑘𝑘(𝐷𝐷) are 7 respectively minimum and maximum value of that particular data considering training and 8 testing datasets. The training stops when any of the following conditions are met: 9

1. The maximum number of epochs (repetitions) is reached, which was chosen as 600. 10 2. The maximum training time has exceeded. 11 3. Performance has been minimized to the goal which is taken as 0.001. 12 4. The performance gradient falls below minimum gradient which is 1e-10. 13 5. The value of ‘mu’, which signifies the rate of convergence of adaptive algorithm, exceeds 14

its maximum which is 1e10. 15

One of the problems in ANN training is over fitting. Over fitting reduces the generalizing 16 capability of the network resulting in inflation in the output value with unseen data. To curb the 17 problem of over fitting, regularization of the performance function was carried out. This was 18 done by adding a term that consists of the mean of the sum of squares of the network weights 19



and biases (msw) to the typical performance function, Mean Square Error (MSE). Thus, the new 1 performance function become 2

New performance function = αmse + (1 − α) msw, (5) 3

In this study, the value of α is taken as 0.5. Since the weight matrix is randomly 4 initialized for every run, a total of 7 runs were carried out for each training and from these seven 5 runs, an ensemble averaging method which is a type of committee machine, is used to find out 6 the final output. In this method, the output produced by 7 different trained networks are 7 combined and linearly averaged. The motivation of using this technique is due to the fact that 8 differently trained networks (caused by random weights) converge to different local minima on 9 the error surface, and the overall performance is improved by combining the outputs (16). 10

Model Based Approach 11 A model based approach using KFT was used in this study for comparing the performance with 12 ANN. The KFT (17) can be used to estimate state variables, which are used to characterize 13 systems/processes that are described by state space models. The implementation of the Kalman 14 filter requires information regarding the system’s dynamics, statistical information of the system 15 disturbances and measurement errors. It uses the model and the system inputs to predict the a 16 priori state estimate and uses the output measurements to obtain the a posteriori state estimate. 17 Overall, it is a recursive algorithm, so that new measurements can be processed when they are 18 obtained. It needs only the current instant state estimate, current input and output measurements 19 to calculate next instant’s state estimate. The selected approach (12) has minimal data 20 requirement and the aim of the study is to find out whether the use of a large data base and a 21 corresponding data driven approach improve the prediction accuracy significantly. The evolution 22 of travel time between the various subsections was assumed to be 23 24

𝑥𝑥(𝑘𝑘 + 1) = 𝑀𝑀𝑥𝑥(𝑘𝑘) + 𝑤𝑤(𝑘𝑘), (6) 25

where, A is a parameter which relates the travel time in the kth subsection to the travel time in the 26 (k+1)th subsection, 𝑥𝑥(𝑘𝑘) is the travel time taken for covering the given subsection k and 𝑤𝑤(𝑘𝑘) is 27 the associated process disturbance with the kth subsection. The measurement process was 28 assumed to be governed by 29

𝑧𝑧(𝑘𝑘) = 𝑥𝑥(𝑘𝑘) + 𝑎𝑎(𝑘𝑘),(7) 30

where𝑧𝑧(𝑘𝑘) is the measured travel time in a given subsection k and 𝑎𝑎(𝑘𝑘) is the measurement 31 noise. It was further assumed that 𝑤𝑤(𝑘𝑘)and 𝑎𝑎(𝑘𝑘)are zero mean white Gaussian noise signals with 32 𝑄𝑄(𝑘𝑘) and 𝑅𝑅(𝑘𝑘) being their corresponding variances. Thus, two sets of data are required to 33 implement the above scheme - one set of data for the time update equations to calculate the 34 parameter ‘A’ and another data set to be used in the measurement update equations to generate 35 the a posteriori travel time estimate. So, the data obtained from first bus (PV1) were used to 36 obtain the value of A for each subsection and the data from second bus (PV2) were used to obtain 37 the a posteriori travel time estimate of next vehicle (TV). 38

The steps in the algorithm were as follows: 39



1. The entire section of travel between origin and destination was divided into N 1 subsections of equal length (100 m). 2

2. The travel time data from PV1 was used to obtain the value of A by making the 3 assumption that there exists a relation between neighboring subsections and the relation 4 was given as 5

𝑀𝑀 = 𝑋𝑋𝑀𝑀𝑃𝑃 1(𝑘𝑘+1)𝑋𝑋𝑀𝑀𝑃𝑃 1(𝑘𝑘)

, k = 1,2,3, … … . . (n − 1) (8) 6

3. Let 𝑥𝑥𝑇𝑇𝑃𝑃be the travel time taken by the test vehicle (the vehicle for which the travel time 7 needs to be predicted) to cover the given subsection. For the first section, one cannot 8 predict the travel time of test vehicle (TV). So, it was assumed that 9 10 𝑀𝑀[𝑥𝑥𝑇𝑇𝑃𝑃(1)] = 𝑥𝑥�(1), (9) 11

𝑀𝑀[𝑥𝑥𝑇𝑇𝑃𝑃(1) − 𝑥𝑥�(1)2] = 𝑀𝑀(1), (10) 12

where𝑥𝑥�(𝑘𝑘) is the estimate of travel time of the TV in the kth subsection. 13 14

4. For 𝑘𝑘 = 2,3, … , (𝑁𝑁 − 1), the following steps were performed: 15 a. The priori estimate of the travel time was calculated by using 𝑥𝑥�−(𝑘𝑘 + 1) =16

𝑀𝑀𝑥𝑥�−(𝑘𝑘) , where the superscript ′ − ′ denotes the a priori estimate and the 17 superscript ′ + ′ denotes the a posteriori estimate. 18

b. The a priori error variance (denoted by𝑀𝑀−) was calculated using 19

𝑀𝑀−(𝑘𝑘 + 1) = 𝑀𝑀𝑀𝑀+(𝑘𝑘)𝑀𝑀 + 𝑄𝑄(𝑘𝑘),(11) 20

c. The Kalman gain (denoted by K) was calculated by using 21

𝐾𝐾(𝑘𝑘 + 1) = 𝑀𝑀−(𝑘𝑘 + 1)[𝑀𝑀−(𝑘𝑘 + 1) + R(k + 1)]−1(12) 22

d. The a posteriori travel time estimate and error variance were calculated using, 23 respectively, 24

x�+(k + 1) = x�−(k + 1) + K(k + 1)[z(k + 1) − x�−(k + 1)](13) 25

P+(k + 1) = [I − K(k + 1)]P−(14) 26

Thus, the objective here is to predict the travel time of the TV using the travel time 27 obtained from previous two bus data (PV1 and PV2)when the TV is in the kth subsection. 28

RESULTS AND DISCUSSIONS 29 A comparison of performance of the proposed methods was carried out section wise as well as 30 trip wise. The prediction accuracy was calculated using MAPE. The performance of ANN and 31 KFT were compared using the testing data set from 29th January to 4th February, 2013. The bus 32 travel time for each 100 m subsection of the entire section was predicted using this scheme. It 33 has to be noted that for this scheme, the data from the previous 2 buses alone are required as 34 inputs. 35

Figure3 shows a sample result where the predicted travel time from ANN is shown 36 against the actual travel time for one particular section. It can be observed that the prediction 37



values are following the trends in the measured data. The MAPE for this case was observed to be 1 23.23%. Figure 4shows the corresponding MAPE for all the subsections. A similar analysis was 2 carried out using the model based approach and the results obtained are also shown in Figure 4. 3 The error varies from 13.93% to 44.52% for ANN and 16% to 42% for the model based 4 approach. It can be seen that the results are comparable for both the data driven and model based 5 approaches, with a slight advantage to ANN. However, considering the data base requirement for 6 ANN and the minimal data requirement for model based approach, the performance of both can 7 be considered comparable. Especially with agencies that have a minimal data base, the model 8 based approach can be a viable option. 9

10

11

FIGURE 3 MEASURED TRAVEL TIME AND PREDICTED TRAVEL TIME USING ANN. 12

13

FIGURE 4 COMPARISON BETWEEN MACHINE LEARNING AND MODEL BASED APPROACHES. 14

The performance was also checked for specific trips across the stretch. Sample results on 15 31st January 2013 for one peak and one off-peak trips’ using ANN are shown in Figures 5A and 16 5B. It can be observed that the model performs well for all the subsections for both peak and off-17 peak conditions. Corresponding plots using model based approach are shown in Figure 6A and 18

0

10

20

30

40

50

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

MA

PE

Subsection Index

Machine learning approach (ANN) Model based approach (KFT)



6B. Errors were quantified using MAPE and are shown in Table 3 for various trips on a selected 1 day. It can be observed that ANN performed better than KFT in most of the cases. 2

3

FIGURE 5A PEAK TIME PERFORMANCE – TRIPWISE USING ANN. 4

5

FIGURE 5B OFF PEK TIME PERFORMANCE – TRIP WISE USING ANN. 6

7

FIGURE 6APEAK TIME PERFORMANCE – TRIPWISE USING MODEL BASED APPROACH. 8

0.00

50.00

100.00

150.00

200.00

250.00

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

Tra

vel t

ime

(sec

.)

Subsection index with each 500m length

Measure value Predicted value

0.00

50.00

100.00

150.00

200.00

250.00

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

Tra

vel t

ime

(sec

.)


Measured Value Predicted Value

0

50

100

150

200

250

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

Tra

vel t

ime

(sec

.)


Measured value Predicted value



1

FIGURE 6BOFF-PEAK TIME PERFORMANCE – TRIPWISE USING MODEL BASED APPROACH. 2

Table3 MAPE calculated for various trips on 31st January 2013 3

Starting time of trip

Machine learning approach

Model based approach

7.53 16.42 13.42 8.18 15.67 19.29 9.37 25.14 47.92 10.59 15.53 27.16 14.20 36.92 38.37 16.16 19.47 33.14 18.25 15.59 23.71 19.04 28.39 30.82 19.24 35.30 29.60

SUMMARY AND CONCLUSIONS 4 The main aim of Advanced Public Transportation System (APTS) is to attract passengers 5 towards public transport, and thus help to reduce the congestion on urban roads. However, for 6 this to happen in practice, the bus service should be made more attractive and one option for that 7 is to provide reliable information about bus arrival to passengers. However, the reliability of the 8 information provided to passengers mainly depends on the prediction technique used and the 9 input data used in the same. The present study compared the performance of two commonly used 10 prediction techniques, one data driven and the second with minimal data requirement for bus 11 arrival prediction. The data driven technique selected is the Artificial Neural Network method 12 and a model based approach using Kalman Filtering Technique with minimal data requirement 13 of just two previous buses is used for less data demanding technique. To make sure that the data 14 used is the most significant one, pattern analysis using statistical methods was carried out. The 15 performances of both methods are evaluated using data collected for a period of one week from 16 the field. The comparison of both methods showed a better performance from ANN compared to 17 the model based KFT. However, the caveat of using ANN over KFT is the requirement of a large 18 dataset for network training. On the other hand, the model based KFT, which gave comparable 19

0

50

100

150

200

250

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

Tra

vel t

ime

(sec

.)


Measured Value Predicted Value



performance, require very minimal data and hence will be suitable in cases where a good data 1 base is not available. 2 3 ACKNOWLDGEMENT 4 The authors acknowledge the support for this study as a part of the sub-project CIE/10-5 11/168/IITM/LELI under the Centre of Excellence in Urban Transport project funded by the 6 Ministry of Urban Development, Government of India, through letter No. N-11025/30/2008-7 UCD. 8 9 REFERENCES 10

1. Schweiger, C.L. Real-time bus arrival information systems. In Transportation Research 11 Board, TCRP synthesis 48, 2003. 12

2. Jeong, R. and L.R. Rilett. Bus Arrival Time Prediction using Artificial Neural Network 13 Model. In IEEE Intelligent Transportation Systems conference, Washington, D.C., 2004, 14 pp. 988-993. 15 16

3. Wang, J., X. Chen, and S. Guo.Bus travel time prediction model with v - support vector 17 regression. In IEEE Intelligent Transportation Systems 3(1), Washington, D.C., 2009, pp. 18 655-660. 19

4. Liu, H., R. He, K. Zhang, and J. Li. A neural network model for travel time prediction. In 20 Proceedings of IEEE International Conference on Intelligent Computing and Intelligent 21 Systems, 2009, pp.752-756. 22

5. Chien, S.I.J., Y. Ding, and C. Wei. Dynamic Travel time Prediction with Artificial Neural 23 Networks. In Journal of Transportation Engineering 128, No. 5, 2002, pp.429-438. 24

6. Park, D., L.R. Rilett, and G. Han. Spectral basis neural networks for real-time travel time 25 forecasting. In Journal of Transportation Engineering, 125(6), 2002, pp.515-523. 26

7. Dailey, D.J. An Algorithm for Predicting the Arrival Time of Mass Transit Vehicles 27 using AVL data. In Proceedings of the 78th Annual Meeting, (CD-ROM). Transportation 28 Research Board of the National Academics, Washington, D.C., 1999. 29

8. Cathey, F.W. and D.J. Dailey. A Prescription for Transit Arrival/Departure Prediction 30 using AVL Data. In Transportation Research Part C: Emerging Technologies, No.11, 31 2003, pp. 241-264. 32

9. Shalaby, A. and A. Farhan. Prediction models of bus arrival and departure times using 33 AVL and APC data. In Journal of Public Transportation, Vol.7, No.1, 2004, pp. 41-60. 34

10. Nanthawichit, C., T. Nakatsuji, and H. Suzuki. (2003). Application of Probe vehicle data for real time traffic state estimation and short time travel time prediction on a freeway. In Transportation Re-search Board, (CD-ROM), Washington, D.C., 2006.

11. Ramakrishna, Y., V. Lakshmanan, andR. Sivanandan. Bus travel time prediction using 35 GPS data. http://www.gisdevelopment.net/proceedings/mapindia/2006 /student%20oral 36 /mi06stu_html. Accessed Feb, 12, 2013. 37

12. Vanajakshi, L., S.C. Subramanian, andR. Sivanandan. Travel Time Prediction under 38 Heterogeneous Traffic Conditions using GPS Data from Buses. In IET Journal on 39 Intelligent Transportation Systems 3 (1), 2009, pp. 1-9. 40

13. Padmanabhan, R.P.S., K. Divakar, L. Vanajakshi, and S.C. Subramanian. Development 41 of a Real-Time Bus Arrival Prediction System for Indian Traffic Conditions. In IET 42 Journal on Intelligent Transportation Systems 4 (3), 2009, pp. 189-200. 43



14. Kumar, S.V. and L. Vanajakshi. Pattern identification based bus arrival time prediction. 1 In Proceedings of the Institution of Civil Engineers – Transport, 2011, Paper. 1200001. 2

15. Chamberlain, R.G. Great Circle Distance between Two Points. http://www.movabletype. 3 co.uk/ scripts/gis-faq-5.1.html. Accessed Feb, 14, 2013. 4

16. Simon, H. Neural networks, A comprehensive foundation: Pearson education group, Inc., 5 New York, 1999. 6

17. Kalman, R.E. A New Approach to Linear Filtering and Prediction Problems. In Transaction of the ASME-Journal of Basic Engineering 82(1), 1960, pp.35-45.


Documents

Comparison of Model Based and Machine Learning Approaches ... · 1 Comparison of Model Based and Machine Learning Approaches for 2 Bus Arrival Time Prediction 3 4 Vivek Kumar 5 Under