06552441-libre

  • 8/12/2019 06552441-libre

    ROAD TRAFFIC PREDICTION USING BAYESIAN NETWORKS

    Poo Kuan Hoong, Ian K. T. Tan, Ong Kok Chien, Choo-Yee Ting

    Faculty of Computing and Informatics, Multimedia University, Cyberjaya, Malaysia.

    {khpoo, ian, ong.kok.chien08, cyting}@mmu.edu.my

    Keywords: Road Traffic Prediction, Context Aware, Personalized, Bayesian Networks.

    Abstract

    Prior knowledge of road conditions for planned or unplanned journeys is
    beneficial not only in terms of time but potentially also cost, and access
    to real-time information further enhances these benefits. Current systems
    rely on huge infrastructure investments by governments to install cameras,
    road sensors and billboards to keep motorists informed, yet these are at
    best available only at pre-identified hotspots. Radio broadcasts are an
    alternative, but they rely on reports from other motorists, which are often
    delayed and not tailored to the individual motorist. Given these
    limitations of existing approaches to obtaining real-time road conditions,
    this research work leverages mobile devices that provide context-sensitive
    information to propose a predictive analytics framework based on a Bayesian
    Network for road condition prediction. The paper contributes (i) a set of
    evidences (variables) that could potentially be utilized for road condition
    prediction and (ii) the construction of a Bayesian Network model to predict
    road conditions. In conclusion, we present a novel approach that provides
    potentially unlimited coverage of road traffic conditions with
    substantially reduced infrastructure investments.

    1 Introduction

    Knowing the road traffic conditions while travelling is advantageous when
    deciding on alternative routes or even additional planned detours. This is
    evident in daily life, where radio stations compete for listeners by
    providing regular traffic reports, especially during peak traffic hours.
    However, these reports are typically delayed or incorrect, and they depend
    largely on third-party sources that are difficult to verify, such as
    listeners calling in to the radio station to report on the traffic.

    City councils and local governments invest significant financial resources
    to deliver traffic reports through the installation of webcams and traffic
    billboards [10][7]. However, not only do these efforts require significant
    investment and expensive ongoing maintenance, users must also actively
    interact with the system to obtain the localized information that is
    relevant to them.

    The proliferation of smartphones equipped with Global Positioning System
    (GPS) receivers and Internet connectivity presents an opportunity to
    research Awareness Information Environments in Mobile Computing [14] and to
    develop a framework suitable for real-time road traffic reporting. In a
    comprehensive survey, Anagnostopoulos et al. [1] described a
    three-dimensional model space for context-aware systems, in which the three
    dimensions are context model, sensor-centric support, and system behaviour.
    In this research, we propose to map our context-sensitive predictive
    analytics approach onto this model. The context data will comprise more
    than just location [13]: it will also include vehicle direction, speed,
    altitude, driver's information and vehicle type. The second dimension,
    sensor-centric support, will take the form of weather, maps and road design
    information. From these context data and the sensor-centric support, the
    overall road traffic system behaviour will be derived through real-time
    predictive analytics.

    Research has shown that unexpected incidents cause inaccuracy in forecasted
    traffic conditions [21][16][11]. Such incidents include sudden changes in
    traffic flow, weather conditions (e.g., rain or snow) [2], road conditions,
    accidents [18][8][17], and road works or road construction [18]. According
    to statistics provided by the Federal Highway Administration, United
    States, 75% of weather-related vehicle crashes occur on wet pavement and
    47% happen during rainfall. In addition, [9] states that rain can increase
    the crash rate by 71% and the injury rate by 49%. Besides unexpected
    incidents on the road, the inaccuracy of a predictive model is also due to
    insufficient data for training the model [16]. Such data is often difficult
    to collect.

    Various predictive analytics approaches have been proposed by researchers
    to address the challenges in traffic flow prediction, ranging from very
    simple to complex. Examples of simple methods are the random walk (RW),
    which is based solely on information about current traffic conditions; the
    historical average (HA), which uses average flow rates to predict current
    flow rates; and the informed historical average (IHA), which combines RW
    and HA to predict current traffic flow rates. These methods have been shown
    to work in specific situations [19].
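
    The three simple predictors described above can be sketched in a few lines
    of Python. This is only an illustration under stated assumptions: the text
    does not specify how IHA combines RW and HA, so a weighted blend with a
    hypothetical weight alpha is assumed here, and the flow values are made up.

```python
from statistics import mean


def random_walk(current_flow):
    """RW: the forecast is simply the most recent observed flow."""
    return current_flow


def historical_average(past_flows):
    """HA: the forecast is the mean of previously observed flows."""
    return mean(past_flows)


def informed_historical_average(current_flow, past_flows, alpha=0.5):
    """IHA (assumed form): weighted blend of the RW and HA forecasts.

    alpha is a hypothetical weight, not a value taken from the paper.
    """
    rw = random_walk(current_flow)
    ha = historical_average(past_flows)
    return alpha * rw + (1 - alpha) * ha


# Hypothetical flows in vehicles per hour.
past = [400, 420, 380]
now = 500
print(random_walk(now), historical_average(past),
      informed_historical_average(now, past))
```

    With alpha = 0.5 the IHA forecast lies midway between the current
    observation and the historical mean; other weightings are equally
    plausible readings of the description.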

    More complex methods include time series models such as the Autoregressive
    Integrated Moving Average (ARIMA) and seasonal ARIMA [19]. In addition,
    data mining approaches such as Artificial Neural Networks (ANN),
    simulation, regression, fuzzy-neural models, and Markov chain models have
    been employed for prediction of traffic flow [3][4][5][6][12][20][22].
    These data mining methods not only require a dataset for model training,
    but their accuracy also drops when incomplete evidence is fed into the
    model. Therefore, this research work proposes Bayesian Networks (BN) as an
    alternative to address incomplete information in road traffic condition
    prediction.

    Recent studies have employed BN as an alternative solution to the
    challenges faced when modelling and predicting road traffic conditions
    [15][23]. For instance, the work in [15] applied BN to traffic flow
    forecasting in the city of Beijing. In that study, a BN was mapped onto the
    city's roads: the arcs of the network represent traffic flows and the
    weight of each arc represents the volume of flow, so that a higher volume
    corresponds to a stronger arc. Based on a dataset of Beijing traffic
    information, the BN model was constructed using the Expectation
    Maximization (EM) algorithm. The performance of the model was then
    evaluated on another set of Beijing traffic information to determine its
    accuracy. The originally proposed model was later enhanced into a
    spatio-temporal BN [16], in which the network incorporates background
    information such as people's activities around shopping centres, car parks
    and home communities.

    In a study by Zheng et al. [23], a combination of a BN and a Neural Network
    was employed. The model was trained and tested on a dataset comprising
    15-minute-interval traffic information gathered from Singapore's Ayer Rajah
    Expressway. The evaluation results showed that the combined model
    outperformed predictors that merely consist of two Neural Networks. While
    the above two studies employed BN for short-term prediction of traffic
    flow, the study in [11] employs Dynamic Bayesian Networks (DBN) to predict
    traffic flow in real time. The technique used was the multi-regression
    dynamic model (MDM), which aims at preserving the conditional independences
    and causal drives exhibited by the traffic flow series. The advantage of
    employing a DBN lies in its ability to cater for real-time changes rather
    than a one-time short-term prediction. In that study, the model was trained
    and tested on a dataset of traffic flow collected in London over a
    six-month period.

    2 Bayesian Network (BN) Model

    Figure 1: Proposed Bayesian Network model for Road Condition Prediction (M1)

    In this research work, road conditions are predicted from information about
    the traffic received from various sources [21], with the ultimate aim of
    predicting the likelihood of traffic jams. More often than not, the
    information obtained from such events is uncertain, and the accuracy of the
    forecasted traffic conditions therefore varies [21][16][11]. For this
    reason, maximizing the number of events that can serve as evidence was an
    important phase in building the BN model.

    [Figure 1 node labels. Event sub-network: Weather web service, Tweet for
    bad weather, Tweet for accident, Low travel speed, Tweet for road block,
    Road block announcement, Tweet for road construction, Road construction
    announcement, Tweet for others. Road condition sub-network: Bad weather
    condition, Accident, Road block, Road construction, Others. Remaining node:
    Traffic jam.]

    Figure 1 depicts the proposed BN model for prediction of road conditions in
    this research work (hereafter referred to as M1). There are two main
    sub-networks, namely the Event sub-network and the Road Condition
    sub-network. The Event sub-network consists of nodes that represent the
    events from which information about road conditions can be obtained, mainly
    from micro-blogging in the form of Twitter traffic tweets. Such events are
    referred to as evidential nodes in BNs. The events (evidential nodes)
    identified in this research work are weather web service, tweet for bad
    weather, tweet for accident, low travel speed, tweet for road block,
    announcement for road block, tweet for road construction, announcement for
    road construction, and tweet for others.

    The second sub-network is the Road Condition sub-network. The nodes in this
    sub-network are Bad weather condition, Accident, Road Block, Road
    Construction, and Others. These nodes are referred to as intermediate
    nodes. Arcs are directed from the intermediate nodes to the evidential
    nodes so that the likelihood of traffic jams can be indirectly inferred
    from the states instantiated to the evidential nodes.

    For instance, the probability of a traffic jam increases if the probability
    of an accident increases. The probability of an accident, in turn, depends
    on whether or not the inference engine receives a tweet about an accident.
    Similarly, on receiving a tweet about a road block and an announcement
    about a road block, the probability of Road Block increases, which in turn
    increases the chance of traffic jams. Information about road blocks in
    Malaysia can be obtained via posts from Twitter accounts including
    @KLroadblock, @TrafficDotMy, @kltrafficupdate, @kltraffic, @LLMinfotrafik,
    @amptraffic, @LEKAStrafik and @plustrafik. Statistics show that 17 out of
    245 records (6.94%) of the tweets about traffic jams are caused by road
    blocks at particular locations. As shown in Figure 1, there is also an arc
    directed from traffic jam to low travel speed. This arc indicates that an
    observation of low travel speed directly influences the inferred
    probability of a traffic jam.
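
    The way evidence on a tweet propagates to the traffic jam node can be
    illustrated with a minimal sketch. It assumes a three-node fragment in
    which Accident is the parent of both TweetForAccident and TrafficJam; all
    probability values below are hypothetical and are not taken from the model
    M1, and the code does not use the SMILE library.

```python
# Hypothetical conditional probabilities for the fragment
# Accident -> TweetForAccident and Accident -> TrafficJam.
P_ACCIDENT = 0.10                                     # prior P(Accident=yes)
P_TWEET_GIVEN_ACCIDENT = {True: 0.80, False: 0.05}    # P(Tweet=yes | Accident)
P_JAM_GIVEN_ACCIDENT = {True: 0.90, False: 0.30}      # P(Jam=yes | Accident)


def posterior_accident(tweet_seen):
    """P(Accident=yes | TweetForAccident=tweet_seen) via Bayes' rule."""
    def likelihood(accident):
        p = P_TWEET_GIVEN_ACCIDENT[accident]
        return p if tweet_seen else 1.0 - p

    yes = P_ACCIDENT * likelihood(True)
    no = (1.0 - P_ACCIDENT) * likelihood(False)
    return yes / (yes + no)


def prob_jam(tweet_seen):
    """P(TrafficJam=yes | TweetForAccident=tweet_seen), summing out Accident."""
    p_acc = posterior_accident(tweet_seen)
    return (p_acc * P_JAM_GIVEN_ACCIDENT[True]
            + (1.0 - p_acc) * P_JAM_GIVEN_ACCIDENT[False])


print(posterior_accident(True), prob_jam(True), prob_jam(False))
```

    With these made-up numbers, receiving a tweet raises the probability of an
    accident from 0.10 to about 0.64, and the probability of a traffic jam
    rises with it, which is the qualitative behaviour described above.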

    3 Implementation

    This study began with the collection of events to serve as evidence for the
    BN. The identified events include road traffic, road construction, road
    blocks, and weather, gathered from various online resources such as website
    announcements, weather web services and social networking updates,
    especially Twitter tweets. The preliminary evaluation of the proposed BN
    model was restricted to Twitter tweets and users' reports as the main
    sources of evidence.

    The Bayesian Network model was constructed using GeNIe & SMILE. SMILE
    stands for Structural Modelling, Inference, and Learning Engine. It
    consists of C++ library classes implementing graphical decision-theoretic
    methods, such as BN and influence diagrams, directly amenable to inclusion
    in intelligent systems. GeNIe is a Windows-based user interface application
    for graphical decision-theoretic models. In short, SMILE is the engine and
    GeNIe is the graphical front end that helps to construct the BN and
    influence diagrams. Both modules were developed at the Decision Systems
    Laboratory, University of Pittsburgh.

    Preparing the dataset for evaluation of the proposed BN shown in Figure 1
    was not a trivial task, mainly because no standard dataset exists that can
    be used directly to evaluate the predictive accuracy of the network. The
    evaluation phase therefore began with the preparation of a dataset, which
    was generated by randomly assigning values to each of the features. From
    the 1000 randomly generated records, 200 records that resembled the
    experts' beliefs were handpicked by human experts to form a dataset for
    training and testing the proposed BN model for traffic prediction.

    The evaluation process began with the creation of two variations of BN,
    namely a Naïve Bayesian Network (M2) and a parameter-learning Bayesian
    Network (M3), as benchmarks to measure the predictive accuracy of the
    proposed Bayesian Network (M1), particularly when no human intervention was
    involved in the creation process.

    Figure 2: Sample Scenario of Road Information for Jalan Duta and Joint
    Probability Expression

    Before the models can be evaluated, the raw data must be pre-processed and
    the models trained. Two C++ programs, namely FILTER and TESTER, were
    developed to perform this work. FILTER pre-processes the data by keeping
    the cases that match the criteria we are interested in (the filtering
    rules) and dividing them into two separate files, 80% for training and 20%
    for testing. The filtering rules are: (1) if there is evidence of a
    particular event/incident, then the event must occur; (2) if there is no
    evidence of a particular event/incident, the event may still occur; (3) if
    none of the events, including others, occurs, then the traffic jam does not
    occur.
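
    The filtering rules and the 80/20 split can be sketched as follows. This is
    a Python illustration rather than the authors' C++ FILTER program: the
    column names are hypothetical (modelled on the node names in Figure 1),
    records are assumed to be dictionaries of 0/1 values, and the shuffle
    before splitting is an assumption, since the text does not say how cases
    are assigned to the two files.

```python
import random

# Hypothetical mapping from each intermediate event to its evidence columns.
EVIDENCE_FOR = {
    "BadWeatherCondition": ["WeatherWebService", "TweetForBadWeather"],
    "Accident": ["TweetForAccident"],
    "RoadBlock": ["TweetForRoadBlock", "AnnouncementForRoadBlock"],
    "RoadConstruction": ["TweetForRoadConstruction",
                         "AnnouncementForRoadConstruction"],
    "Others": ["TweetForOthers"],
}


def keep(record):
    """Return True if the record satisfies filtering rules (1) to (3)."""
    for event, evidence_cols in EVIDENCE_FOR.items():
        has_evidence = any(record[col] for col in evidence_cols)
        # Rule 1: evidence of an event means the event must occur.
        if has_evidence and not record[event]:
            return False
        # Rule 2: without evidence the event may or may not occur,
        # so there is nothing to check.
    # Rule 3: if no event occurs at all, there must be no traffic jam.
    no_event = not any(record[event] for event in EVIDENCE_FOR)
    if no_event and record["TrafficJam"]:
        return False
    return True


def filter_and_split(records, train_fraction=0.8, seed=0):
    """Keep the valid records and split them into training and testing sets."""
    kept = [r for r in records if keep(r)]
    rng = random.Random(seed)   # assumed: shuffle before splitting
    rng.shuffle(kept)
    cut = int(len(kept) * train_fraction)
    return kept[:cut], kept[cut:]
```

    Calling filter_and_split(records) returns a (training, testing) pair of
    lists that could then be written out to the two files mentioned above.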

    TESTER is the main program, which evaluates the predictive accuracy of the
    models on the data. It utilizes the SMILE library, which provides BN
    features such as parameter learning on top of an existing model and
    generation of a Naïve Bayes model from datasets. In addition, the program
    reads the dataset in .csv format and sets the evidence in the model to
    obtain the final result, such as whether a traffic jam is likely to occur.
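
    To make the Naïve Bayes benchmark concrete, the sketch below trains a small
    Naïve Bayes classifier by counting and predicts the traffic jam state from
    whatever evidence is supplied. It is a from-scratch Python illustration,
    not the SMILE API used by TESTER; the feature names, the binary 0/1
    encoding and the add-one smoothing are assumptions.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(records, target="TrafficJam"):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(r[target] for r in records)
    feature_counts = defaultdict(Counter)  # (feature, class) -> value counts
    for r in records:
        for feature, value in r.items():
            if feature != target:
                feature_counts[(feature, r[target])][value] += 1
    return class_counts, feature_counts


def predict(model, evidence):
    """Return the most probable class given a dict of observed features.

    Features missing from the evidence are simply left out of the product.
    """
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for cls, n_cls in class_counts.items():
        score = math.log(n_cls / total)
        for feature, value in evidence.items():
            count = feature_counts[(feature, cls)][value]
            # Add-one smoothing, assuming two states (0/1) per feature.
            score += math.log((count + 1) / (n_cls + 2))
        if score > best_score:
            best_class, best_score = cls, score
    return best_class


# Tiny made-up training set.
records = [
    {"LowTravelSpeed": 1, "TweetForAccident": 1, "TrafficJam": 1},
    {"LowTravelSpeed": 1, "TweetForAccident": 0, "TrafficJam": 1},
    {"LowTravelSpeed": 0, "TweetForAccident": 0, "TrafficJam": 0},
    {"LowTravelSpeed": 0, "TweetForAccident": 0, "TrafficJam": 0},
]
model = train_naive_bayes(records)
print(predict(model, {"LowTravelSpeed": 1}))
```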

    Sample Scenario

    There is a tweet about heavy rain at Jalan Duta.
    According to the weather web service, the weather condition is bad at Jalan Duta.
    There is no tweet about an accident at Jalan Duta.
    Vehicles are moving slowly at Jalan Duta.
    There is no road block and no road construction.
    There is no other tweet about Jalan Duta.

    Joint Probability Expression:

    P(TrafficJam=YES | BadWeatherCondition=YES) ×
    P(BadWeatherCondition=YES | WeatherWebService=YES, TweetForBadWeather=YES) ×
    P(TrafficJam=YES | Accident=NO) ×
    P(Accident=NO | TweetForAccident=NO) ×
    P(TrafficJam=YES | LowTravelSpeed=YES) ×
    P(TrafficJam=YES | RoadBlock=NO) ×
    P(RoadBlock=NO | TweetForRoadBlock=NO, AnnouncementForRoadBlock=NO) ×
    P(TrafficJam=YES | RoadConstruction=NO) ×
    P(RoadConstruction=NO | TweetForRoadConstruction=NO, AnnouncementForRoadConstruction=NO) ×
    P(TrafficJam=YES | Others=NO) ×
    P(Others=NO | TweetForOthers=NO)

    To evaluate the accuracy of M1, three experiments were conducted. In the
    first experiment, M1, M2, and M3 (with five iterations of parameter
    learning) were compared using five different datasets. In the second
    experiment, M1 was compared with M2 using 10 different datasets. Lastly,
    the third experiment was conducted to determine the influence of dataset
    size on the accuracy of the model.

    4 Results and Discussion

    Figure 3: Comparing M1, M2, and M3 with 5 datasets

    Figures 3, 4 and 5 present the results of the three experiments. As shown
    in Figure 3, the accuracy of M3 remains constant after the first iteration,
    as the parameters had already reached their optimal level. Hence, there is
    no need to perform iterative learning on top of M3 in a real working
    environment.

    Figure 4: Performance comparison of M1 and M2 with 10 datasets

    To assess the accuracy of road traffic prediction, we measured the number
    of test results that matched the actual results and the corresponding
    percentage of matches. M2 had the lowest average accuracy (58.57%),
    followed by M1 (74.37%), while M3 scored the highest accuracy of 76.01%.
    As depicted in Figure 4, M1 consistently outperforms M2 regardless of the
    dataset size. The average accuracy of M1 was 72.10%, which is 5.13% higher
    than the average accuracy of M2.
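
    The accuracy measure described above, the number and percentage of
    predictions that match the actual outcomes, can be written as a short
    helper. The function name and the example values are illustrative only.

```python
def accuracy(predicted, actual):
    """Return (number of matches, percentage of matches)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("no test cases supplied")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches, 100.0 * matches / len(actual)


# Made-up predictions for four test cases (1 = traffic jam, 0 = no jam).
print(accuracy([1, 0, 1, 1], [1, 1, 1, 0]))
```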

    Figure 5: Accuracy Comparison between M1, M2, and M3 using 3 datasets

    Figure 5 shows the result of the third experiment. From our observation,
    the size of the dataset did not consistently influence the accuracy of the
    BN model (M3) for road traffic prediction. The smallest dataset of 170
    cases scored the highest accuracy, while the medium dataset of 1,700 cases
    scored the lowest, below that of the largest dataset of 17,000 cases.

    5 Conclusion

    The research work presented in this paper had three objectives: (1) to
    identify the variables that can be used to infer traffic conditions, (2) to
    identify the relationships between these variables and traffic jams, and
    (3) to construct a BN model that represents the relationships among the
    variables in order to infer traffic jams. The variables (unexpected
    incidents) we identified are accident, weather condition (rain/snow), road
    work (construction), road block, and speed of vehicles. We evaluated our
    proposed BN model with its original CPT (M1) against a Naïve Bayesian
    Network (M2) and a parameter-learning Bayesian Network (M3). Based on our
    initial results, the proposed BN model shows promising results, even though
    its prediction accuracy is lower than that of the parameter-learning model.
    In future work, we intend to improve the BN model in order to achieve more
    accurate road traffic prediction.

    References

    [1] C. B. Anagnostopoulos, A. Tsounis, S. Hadjiefthymiades, "Context Awareness in Mobile Computing Environments," Journal of Wireless Personal Communication, Vol. 42 (3), Aug. 2007, pp. 445-464, (2007).
    [2] R. Billot, "Integrating The Effects Of Adverse Weather Conditions On Traffic: Methodology, Empirical Analysis And Bayesian Modelling," [Retrieved from] http://www.ectri.org/YRS09/Papiers/Session3/Billot_R_Session3_Traffic(2).pdf, (2009).
    [3] R. Chrobok, J. Wahle, and M. Schreckenberg, "Traffic forecast using simulations of large scale networks," in Proc. 4th IEEE Int. Conf. Intelligent Transportation Systems, Oakland, CA, pp. 434-439, (2001).
    [4] M. Danech-Pajouh and M. Aron, "ATHENA, a method for short-term inter-urban traffic forecasting," INRETS, Paris, France, Tech. Rep. 177, (1991).
    [5] G. A. Davis, "Adaptive forecasting of freeway traffic congestion," Transp. Res. Rec., no. 1287, pp. 29-33, (1990).
    [6] M. Der Voort, M. Dougherty, and S. Watson, "Combining Kohonen maps with ARIMA time series models to forecast traffic flow," Transp. Res., Part C Emerg. Technol., Vol. 4 (5), pp. 307-318, (1996).
    [7] Dewan Bandaraya Kuala Lumpur (DBKL) Integrated Transport Information System. [Online] http://www.itis.com.my/atis/index.jsf
    [8] S.O. John, E.B. Fabian, "Distributed or Centralized - The Applications Take," Proceedings of the 6th Annual IEEE communications society conference on Sensor, Mesh and Ad Hoc Communications and Networks (SECON'09), pp. 709-718, (2009).
    [9] Q. Lin, W. Nixon, "Effects of Adverse Weather on Traffic Crashes: Systematic Review and Meta-Analysis," in Proceedings of the 87th annual meeting of the Transportation Research Board. CDROM. Transportation Research Board of the National Academies, Washington, D.C., pp. 139-146, (2008).
    [10] New Zealand Transport Agency: Auckland Traffic Flow. [Online] http://www.nzta.govt.nz/traffic/current-conditions/webcams/auckland/traffic.phtml
    [11] C. Queen, C. Albers, "Intervention and causality: forecasting traffic flows using a dynamic Bayesian network," Journal of the American Statistical Association, Vol. 104 (486), pp. 669-681, (2009).
    [12] B. L. Smith and M. Demetsky, "Traffic flow forecasting: Comparison of modelling approaches," J. Transp. Eng., Vol. 123 (4), pp. 261-266, (1997).
    [13] A. Schmidt, M. Beigl, Hans-W. Gellersen, "There is more to context than location," Computers & Graphics, Vol. 23 (6), Dec., pp. 893-901, (1999).
    [14] K. Stefanidis, E. Pitoura, "Related Work on Context-Aware Systems," Work in Progress Report, Department of Computer Science, University of Ioannina, Greece. [Retrieved from] http://softsys.cs.uoi.gr/deca/deca-survey.pdf, (2001).
    [15] S. Sun, C. Zhang, and G. Yu, "A Bayesian network approach to traffic flow forecasting," IEEE Trans. Intell. Transp. Syst., Vol. 7 (1), pp. 124-131, (2006).
    [16] S. Sun, C. Zhang, and Y. Zhang, "Traffic flow forecasting using a spatio-temporal Bayesian Network predictor," in Artificial Neural Networks: Formal Models and Their Applications (ICANN), pp. 273-278, (2005).
    [17] G.Z. Tan, Z.P. Liu, Y.D. Wang, "The Determination and Analysis of Traffic Congestion Evacuation Priority," 2nd IITA International Conference on Geoscience and Remote Sensing, pp. 484-487, (2010).
    [18] M. Wachs, "Fighting traffic congestion with information technology," Issues in Science and Technology. National Academy of Sciences. [Retrieved from] http://www.highbeam.com/doc/1G1-93659945.html, (2002).
    [19] B. M. William, "Modeling and forecasting vehicular traffic flow as a seasonal stochastic time series process," Ph.D. dissertation, Dept. Civil Eng., Univ. Virginia, Charlottesville, VA, (1999).
    [20] H. B. Yin, S. C. Wong, J. M. Xu, and C. K. Wong, "Urban traffic flow prediction using a fuzzy neural approach," Transp. Res., Part C Emerg. Technol., Vol. 10 (2), pp. 85-98, Apr. (2002).
    [21] J.Y. Young, M.G. Cho, "A Short-Term Prediction Model for Forecasting Traffic Information Using Bayesian Network," Third 2008 International Conference on Convergence and Hybrid Information Technology (ICCIT), pp. 242-247, (2008).
    [22] G. Q. Yu, J. M. Hu, C. S. Zhang, L. K. Zhuang, and J. Y. Song, "Short term traffic flow forecasting based on Markov chain model," in Proc. IEEE Intelligent Vehicles Symp., Columbus, OH, pp. 208-212, (2003).
    [23] W. Zheng, D. Lee, and Q. Shi, "Short-term freeway traffic flow prediction: Bayesian combined neural network approach," J. Transp. Eng., Vol. 132 (2), pp. 114-121, (2006).