Accident Analysis and Prevention 63 (2014) 74–82

Contents lists available at ScienceDirect

Accident Analysis and Prevention

journal homepage: www.elsevier.com/locate/aap

Application of Poisson random effect models for highway network screening

Ximiao Jiang ∗, Mohamed Abdel-Aty, Samer Alamili

Department of Civil, Environmental & Construction Engineering, The University of Central Florida, Orlando, FL 32816, United States

∗ Corresponding author. Tel.: +1 8653008424. E-mail addresses: [email protected] (X. Jiang), [email protected] (M. Abdel-Aty), [email protected] (S. Alamili).

0001-4575/$ – see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.aap.2013.10.029

article info

Article history: Received 25 June 2013; Received in revised form 2 October 2013; Accepted 25 October 2013

Keywords: Hotspot identification; Network screening; Empirical Bayesian; Full Bayesian; Random effect; Poisson Log-Normal

abstract

In recent years, Bayesian random effect models that account for the temporal and spatial correlations of crash data have become popular in traffic safety research. This study employs random effect Poisson Log-Normal models for crash risk hotspot identification. Both the temporal and spatial correlations of crash data were considered. Potential for Safety Improvement (PSI) was adopted as a measure of the crash risk. Using the fatal and injury crashes that occurred on urban 4-lane divided arterials from 2006 to 2009 in the Central Florida area, the random effect approaches were compared to the traditional Empirical Bayesian (EB) method and the conventional Bayesian Poisson Log-Normal model. A series of method examination tests were conducted to evaluate the performance of the different approaches. These tests include the previously developed site consistency test, method consistency test, total rank difference test, and the modified total score test, as well as the newly proposed total safety performance measure difference test. Results show that the Bayesian Poisson model accounting for both temporal and spatial random effects (PTSRE) outperforms the model with only a temporal random effect, and both are superior to the conventional Poisson Log-Normal model (PLN) and the EB model in the fitting of crash data. Additionally, the method evaluation tests indicate that the PTSRE model is significantly superior to the PLN model and the EB model in consistently identifying hotspots during successive time periods. The results suggest that the PTSRE model is a superior alternative for road site crash risk hotspot identification.

© 2013 Elsevier Ltd. All rights reserved.

1. Introduction

The significance of traffic safety is emphasized in the Transportation Equity Act for the 21st Century (TEA-21) and the Safe, Accountable, Flexible, Efficient, Transportation Equity Act: A Legacy for Users (SAFETEA-LU). In particular, SAFETEA-LU requires states to develop a Strategic Highway Safety Plan (SHSP) and a comprehensive Highway Safety Improvement Program (HSIP) guideline to improve highway safety (U.S. DOT, 1998, 2009). Specifically, the HSIP requires states to submit an annual report describing not less than 5 percent of their highway locations exhibiting the most severe safety needs (Section 148(c)(1)(D)). The intent of this provision is on one hand to raise public awareness of the highway safety needs and challenges in the States, and on the other hand to identify the most hazardous sites that can be effectively improved by implementing countermeasures.

In the last few decades, various methodologies based on different traffic safety performance measures have been developed to identify the most hazardous road sites. Conventionally, crash frequencies, crash rates and safety indices adjusted by crash severity were employed to identify unsafe sites. Specifically, entities with a higher crash count, crash rate or equivalent property damage only (EPDO) value are selected as hotspots in which potential safety improvement treatments are needed. The number of hotspots can be determined by pre-specified criteria such as a proportion or threshold. These methods require little data, but are subject to a few critical problems, including the so-called regression-to-the-mean (RTM) issue and the false assumption of a linear relation between crash count and traffic volume (Hauer, 1997; Alluri, 2008). Alternatively, McGuigan (1981) proposed a method that uses potential for accident reduction (PAR) as a measure of crash risk. PAR is defined as the difference between the observed crash count at the entity and the expected crash frequency at similar sites. His subsequent study (McGuigan, 1982) suggested that PAR is a superior safety performance measure as compared to crash count and crash rate, in terms of the cost-effectiveness of safety improvement strategies. However, Maher and Mountain (1988) argued that using crash count as the safety performance measure may perform as well as or better than using PAR, due to the inaccuracy of the estimated expected crash frequency at similar sites.

In light of the arguments about traditional methods, the approach based on Empirical Bayesian (EB) adjusted safety performance measures became popular (Hauer, 1997), and has recently been


made available through several safety design and evaluation tools, including the Interactive Highway Safety Design Model (IHSDM), SafetyAnalyst and Highway Safety Manual (HSM). The EB method makes joint use of two clues to the safety of a road site, the observed crash count at the site and the predicted crash frequency of similar sites. The EB estimation procedure assigns a weight to each of these clues based on the strength of the observed crash count and the reliability of the predicted crash frequency. Thus, the safety performance obtained from the EB method is expected to be more reliable as compared to traditional methods. Based on the application of the EB method, Persaud (1999) proposed a safety performance measure named as the Potential for Safety Improvement (PSI). Unlike the PAR index, PSI measures the difference between the EB adjusted expected crash frequency of a site and the corresponding predicted crash frequency for similar sites. Hence, PSI was developed to identify sites that exhibit abnormally high unobserved random effects (i.e., higher potential for safety improvement as compared to similar sites). The EB method often yields estimates of the crash risk that are nearly equivalent to those from Full Bayesian (FB) procedures, but it ignores the uncertainty in the variance of the sites to be studied and the reference population (Carlin and Louis, 2000). Moreover, the accuracy of the prediction sometimes rests heavily on the reliability of the safety performance functions (SPF) that were fitted by the reference population. However, acquiring a reliable SPF involves considerable labor of data collection and cleaning.

More recently, with the development of statistical analysis tools, a few scholars started to employ the FB models for hotspot identification (HSID). For example, Aguero-Valverde and Jovanis (2009) employed Bayesian multivariate Poisson Log-Normal models for crash severity modeling and hotspots identification. A total of 6353 rural two-lane segments were analyzed in their study. Results show that the multivariate FB model performs very well in identifying hotspots. The application of FB models increases the flexibility of improving on current hotspot identification approaches based on panel crash data. One major issue in analyzing panel data is the potential correlations among observations. In general, there are two levels of correlations in the panel data: (1) temporal correlations among observations in a specific road site over time, and (2) spatial correlations among observations of different sites within a certain geographical region in the same time period. In the presence of temporal and spatial correlations, traditional models based on the independence assumptions of unobserved error terms will produce biased parameter estimates (Greene, 2000). Bearing this concern, a few models that account for temporal and spatial correlations were developed. The most frequently used models include the fixed and random effect Poisson and Negative Binomial models (Shankar et al., 1998; Ulfarsson and Shankar, 2003; Chin and Quddus, 2003; Caliendo et al., 2007; Jiang et al., 2013). For example, Shankar et al. (1998) employed both the random-effect negative binomial and the cross-sectional negative binomial (considering location and time as covariates) to estimate factors that affect median crossover accidents in Washington state. They found that the random-effect negative binomial outperformed the negative binomial model when spatial and temporal effects are totally unobserved, which is reasonable because geometric and traffic variables are likely to have location-specific effects. Jiang et al. (2013) investigated the performance of random effect Poisson and random-effect negative binomial models in predicting crash frequency. They found that the random-effect negative binomial model accounting for the temporal correlation of crash observations is superior to regular negative binomial and Poisson models.

Despite the significance of the temporal and spatial effects on traffic safety analysis, very limited studies have accounted for them in hotspot identification (HSID) analysis. Huang et al. (2009) proposed a full Bayesian (FB) hierarchical modeling approach in traffic crash HSID. Two FB hierarchical Poisson Log-Normal models were developed to account for the constant temporal correlation and the serial correlation between successive years of each specific road site, respectively. The results show that the FB hierarchical models significantly outperformed the standard EB approach in identifying hotspots. However, their study did not take into account the spatial correlation of road sites within the same corridor. Moreover, the evaluation of methods in their research was based on the false assumption that crash rates can be used to recognize "true" hotspots. Hence, the study of random effect models in HSID leaves much to be desired.

The objective of this paper is to investigate the significance of considering temporal and spatial correlations in micro-level HSID. FB Poisson Log-Normal models with spatial and temporal random effects were developed. PSI was employed as the measure of crash risk to rank hotspots. The performance of the FB random effect models in identifying hotspots was compared with regular FB Poisson Log-Normal models and EB method.

2. Methodologies

2.1. Empirical Bayesian method

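The EB expression in the Highway Safety Manual (HSM) can be derived from the FB Poisson-Gamma distribution. Assume the crash frequency of site $i$ follows

$$
\begin{cases}
y_i \sim \text{Poisson}(\lambda_i) \\
\lambda_i \sim \text{Gamma}(\alpha, \beta)
\end{cases}
\tag{1}
$$

Given that $n$ years of observed data are available for site $i$ (with $y_i = y_{i1} + \cdots + y_{in}$ denoting the total observed crash count over those years), the posterior mean of $\lambda_i$ can be obtained as (Hoff, 2009):

$$
E(\lambda_i \mid y_{i1} \ldots y_{in}, \alpha, \beta) = \frac{\beta}{\beta + n} \cdot \frac{\alpha}{\beta} + \frac{1}{\beta + n} \cdot y_i
\tag{2}
$$

By definition (Carlin and Louis, 2000), the EB approach uses the observed data in similar sites to estimate these final stage parameters in the hierarchical models, and then proceeds as though the prior were known. Under the Poisson-Gamma assumption, the mean crash frequency of site $i$, $\lambda_i$, follows the gamma distribution as expressed in Eq. (1). Hence, the hyper-parameters $\alpha$ and $\beta$ can be expressed as: $\alpha/\beta = E(\lambda_i)$, $\alpha/\beta^2 = \mathrm{var}(\lambda_i)$, and then $\mathrm{var}(\lambda_i)/E(\lambda_i) = E(\lambda_i)/\alpha$. Substituting these expressions into the posterior equation, and writing $\bar{y}_i = y_i/n$ for the mean annual observed crash count, we obtain

$$
E(\lambda_i \mid y_i, \alpha, \beta) = \frac{1}{1 + \dfrac{n \times E(\lambda_i)}{\alpha}} \cdot E(\lambda_i) + \left(1 - \frac{1}{1 + \dfrac{n \times E(\lambda_i)}{\alpha}}\right) \cdot \bar{y}_i
\tag{3}
$$

Define $k = 1/\alpha$, where $\alpha$ is the dispersion parameter of the Poisson-Gamma model. It is noteworthy that $k$ is named the overdispersion parameter in the HSM and many other publications. Define $\mu = E(\lambda_i)$, the predicted crash frequency for site $i$, which is calculated from safety performance functions (SPF). Then the weight $\omega$ placed on the predicted crash frequency in Eq. (3) is

$$
\omega = \frac{1}{1 + k \cdot n \cdot \mu}
\tag{4}
$$

Accordingly, PSI for the EB method can be derived as

$$
\mathrm{PSI} = E(\lambda_i \mid y_i, \alpha, \beta) - E(\lambda_i)
\tag{5}
$$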

Traditionally, the SPF was fitted with historical data from a reference population. In this paper, a locally specific SPF for fatal and injury (FI) crashes was developed, i.e., the authors employed the same data in the study area to develop the SPF. Thus, the developed SPF is deemed the most accurate one for the study area. In other words, the conventional EB method that is based on SPFs obtained from the reference population may not produce as accurate a crash prediction as this one. The logarithms of the average annual daily traffic (AADT) and segment length were considered as predictors. To be consistent with the HSM and many other studies, the Poisson Gamma (Negative Binomial) model was employed to develop the SPF.
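As an illustration of Eqs. (3)–(5), the following minimal Python sketch computes the EB weight, the EB-adjusted expected crash frequency, and the resulting PSI for a single site. It assumes that the observed term in Eq. (3) is the mean annual count and that the SPF prediction is an annual value; the function names and the example numbers are hypothetical and are not taken from the paper.

```python
def eb_weight(k, n_years, mu):
    """Eq. (4): weight on the SPF prediction, w = 1 / (1 + k * n * mu)."""
    return 1.0 / (1.0 + k * n_years * mu)


def eb_estimate(yearly_counts, mu, k):
    """Eq. (3): weighted mix of the SPF prediction and the mean observed count."""
    n_years = len(yearly_counts)
    y_bar = sum(yearly_counts) / n_years  # assumption: mean annual observed count
    w = eb_weight(k, n_years, mu)
    return w * mu + (1.0 - w) * y_bar


def eb_psi(yearly_counts, mu, k):
    """Eq. (5): EB-adjusted expected frequency minus the SPF prediction."""
    return eb_estimate(yearly_counts, mu, k) - mu


# Hypothetical site: 4 years of counts, SPF prediction of 3 crashes/year, k = 0.5.
counts = [6, 4, 7, 5]
print(eb_weight(0.5, len(counts), 3.0))
print(eb_estimate(counts, 3.0, 0.5))
print(eb_psi(counts, 3.0, 0.5))
```

With k = 1/α and β = α/μ, the value returned by `eb_estimate` coincides with the Gamma posterior mean (α + Σy)/(β + n), which is a quick way to sanity-check the weighting.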

2.2. Full Bayesian methods

Unlike the EB method, the FB approach provides a flexible and direct means to fully accommodate and measure uncertainties associated with model prediction of the expected crash number. The Poisson model is known as the simplest and most common model for count data regression analysis and has been widely used in crash frequency modeling studies (Jovanis and Chang, 1986; Joshua and Garber, 1990; Miaou and Lum, 1993). However, empirical analysis shows that vehicle crash data do not satisfy the Poisson assumption that the variance equals the mean, typically having a larger variance relative to the mean, a phenomenon known as over-dispersion (Lord and Mannering, 2010). In this situation, the common Poisson regression model is inappropriate as it can result in biased and inconsistent parameter estimates. The Poisson Log-Normal (PLN) distribution, a more flexible probability distribution for counts, has become popular in modeling crash count data in the last few years due to its flexibility in accounting for over-dispersion (Ma et al., 2008; Park and Lord, 2007).

The PLN regression model can be derived from the Poisson model by assuming that there is an observation specific error term $\varepsilon_{it}$, which follows the normal distribution. The framework of the PLN regression model can be expressed as Eq. (6).

$$
\begin{cases}
y_{it} \sim \text{Poisson}(\lambda_{it}) \\
\lambda_{it} = \exp(x'_{it}\beta + \varepsilon_{it}) = \mu_{it} \cdot \exp(\varepsilon_{it}) \\
\varepsilon_{it} \sim \text{Normal}(0.0, \delta^2)
\end{cases}
\tag{6}
$$

where $y_{it}$ is the observed crash frequency of site $i$ in time period $t$. $\mu_{it}$ is the mean predicted crash frequency for site $i$ in time period $t$. $x_{it}$ is a vector of independent variables such as the AADT and segment length, and $\beta$ is a vector of the coefficients for each independent variable and the intercept term. $\delta^2$ is the variance of the normal distribution for $\varepsilon_{it}$. According to Eq. (6), the error term $\varepsilon_{it}$ varies across road sites and over time. Apparently, the PLN model can account for the over-dispersion problem in the regular Poisson model.

However, it is expected that the same road segment shares identical unobserved features over years. This is the so-called temporal correlation. In order to account for this level of correlation, a site specific random effect is added to the Poisson model to form a random effect Poisson model. For simplicity, this model is denoted as the Poisson temporal random effect (PTRE) model, which can be expressed as

$$
\begin{cases}
y_{it} \sim \text{Poisson}(\lambda_{it}) \\
\lambda_{it} = \exp(x'_{it}\beta + \varepsilon_{i}) = \mu_{it} \cdot \exp(\varepsilon_{i}) \\
\varepsilon_{i} \sim \text{Normal}(0.0, \delta^2)
\end{cases}
\tag{7}
$$

Unlike the PLN model, here $\varepsilon_i$ is a site specific random effect, which does not vary over years for each specific site. This quantity is assumed to follow the normal distribution with mean 0 and variance $\delta^2$.

Moreover, there is also possible correlation among road segments in the same corridor, which is known as the spatial correlation. The spatial correlation exists because road segments within one corridor share some common features that are not included in the prediction models, such as the driver population, weather and light condition, transportation facilities and pavement quality. Hence, it is desirable to add a corridor specific random effect into the PTRE model. In this paper, for simplicity, the authors assume that the spatial correlation is constant for every pair of segments in the same corridor, i.e., the spatial correlation does not change with the distance between segments. The random effect model with both temporal and spatial correlations is denoted by PTSRE (Poisson temporal-spatial random effect model), which can be expressed as:

$$
\begin{cases}
y_{ijt} \sim \text{Poisson}(\lambda_{ijt}) \\
\lambda_{ijt} = \exp(x'_{ijt}\beta + \varepsilon_{i} + \varepsilon_{j}) = \mu_{ijt} \cdot \exp(\varepsilon_{i}) \cdot \exp(\varepsilon_{j}) \\
\varepsilon_{i} \sim \text{Normal}(0.0, \delta_1^2) \\
\varepsilon_{j} \sim \text{Normal}(0.0, \delta_2^2)
\end{cases}
\tag{8}
$$

where $y_{ijt}$ is the observed crash frequency of site $i$ in corridor $j$ during time period $t$. $\mu_{ijt}$ is the mean predicted crash frequency for site $i$ in corridor $j$ during time period $t$. $\varepsilon_i$ is a site specific temporal random effect, and $\varepsilon_j$ is a corridor specific spatial random effect. $\delta_1^2$ and $\delta_2^2$ are the variances of $\varepsilon_i$ and $\varepsilon_j$.

By definition, the PSI of the PLN, PTRE and PTSRE models can be obtained as follows:

PLN model:

$$
\mathrm{PSI} = e^{x'_{it}\beta} \cdot (e^{\varepsilon_{it}} - 1)
\tag{9}
$$

PTRE model:

$$
\mathrm{PSI} = e^{x'_{it}\beta} \cdot (e^{\varepsilon_{i}} - 1)
\tag{10}
$$

PTSRE model:

$$
\mathrm{PSI} = e^{x'_{ijt}\beta} \cdot (e^{\varepsilon_{i} + \varepsilon_{j}} - 1)
\tag{11}
$$
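The structure of the PTSRE random effects and the PSI expressions in Eqs. (9)–(11) can be sketched in a few lines of Python. This is only an illustration of the model structure, not the authors' WinBUGS code: the coefficient values, the corridor layout and the function names are hypothetical, and the random effects are simply drawn from their normal priors rather than estimated from data.

```python
import math
import random


def psi_full_bayes(linear_predictor, random_effect):
    """PSI = exp(x'beta) * (exp(effect) - 1), the common form of Eqs. (9)-(11)."""
    return math.exp(linear_predictor) * (math.exp(random_effect) - 1.0)


def draw_ptsre_effects(n_corridors, sites_per_corridor, sd_site, sd_corridor, seed=0):
    """Draw one effect per site and one shared effect per corridor (Eq. 8).

    Note: the arguments are standard deviations, i.e. square roots of the
    variances that appear in Eq. (8).
    """
    rng = random.Random(seed)
    effects = {}
    for j in range(n_corridors):
        eps_j = rng.gauss(0.0, sd_corridor)  # shared by every site in corridor j
        for i in range(sites_per_corridor):
            eps_i = rng.gauss(0.0, sd_site)  # constant over years for this site
            effects[(i, j)] = (eps_i, eps_j)
    return effects


# Hypothetical coefficients; log(AADT) and segment length are the paper's predictors.
beta0, beta_aadt, beta_len = -5.0, 0.5, 0.5
xb = beta0 + beta_aadt * math.log(30000) + beta_len * 0.4

effects = draw_ptsre_effects(n_corridors=2, sites_per_corridor=3,
                             sd_site=1.0, sd_corridor=0.5)
for (i, j), (eps_i, eps_j) in sorted(effects.items()):
    # PTSRE case (Eq. 11): the site and corridor effects are added.
    print(i, j, round(psi_full_bayes(xb, eps_i + eps_j), 3))
```

Passing only the site effect to `psi_full_bayes` gives the PTRE form of Eq. (10), and passing an observation-level effect gives the PLN form of Eq. (9).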

These models were estimated with the full Bayesian techniques using the open source software WinBUGS® 3.0.2 (Windows Bayesian Inference Using Gibbs Sampling). When fitting these models, 20,000 iterations were discarded as burn-in, and the 10,000 iterations that followed were used to obtain summary statistics of the posterior inference. Convergence was assessed by visual inspection of the Markov chains for the parameters. Furthermore, the number of iterations was selected so that the Monte Carlo error would be less than 0.05 for each parameter.

The performance of these models was evaluated to identify the model that provides the best fit for the data. The model comparison was conducted using the Deviance Information Criterion (DIC), which was introduced by Spiegelhalter et al. (2003). The DIC value adjusts for both the fitting and the complexity of the model. The model with the smallest DIC is estimated to be the model that would best predict a replicated dataset of the same structure as that currently observed. A difference of 5 or greater between DIC values for competing models can be deemed to be significant.

While DIC was employed for model selection, it is also necessary to evaluate the efficiency of these models. To assess how well the selected model fits the data, root mean square error (RMSE) for each model was employed. The RMSE can be derived as

$$
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum \left(y^{\mathrm{pred}}_{it} - y^{\mathrm{obs}}_{it}\right)^2}
\tag{12}
$$

where $y^{\mathrm{pred}}_{it}$ represents the predicted crash count of segment $i$ for the year $t$. $y^{\mathrm{obs}}_{it}$ is the observed crash count of segment $i$ for the year $t$. $n$ is the total number of observations. A smaller RMSE value indicates a better data fit.
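The two comparison criteria above are straightforward to compute once predictions and DIC values are available. The sketch below shows Eq. (12) and the "difference of 5 or greater" rule in Python; the helper names and the sample numbers are hypothetical, and the DIC values themselves would come from the Bayesian fitting software rather than from this code.

```python
import math


def rmse(predicted, observed):
    """Eq. (12): root mean square error over all site-year observations."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def dic_clearly_better(dic_a, dic_b, threshold=5.0):
    """True if model A's DIC is lower than model B's by at least the threshold."""
    return (dic_b - dic_a) >= threshold


# Hypothetical predicted and observed crash counts for four site-years.
print(rmse([2.0, 4.0, 1.0, 0.0], [0, 0, 1, 0]))
print(dic_clearly_better(8000.0, 8320.11))
```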

2.3. HSID performance evaluation

Very few studies have attempted to develop criteria to evaluate the performance of different HSID methods. Many previous studies have employed the percentage of false positives (FP) and the percentage of false negatives (FN) to assess the performance of HSID methods (Higle and Hecht, 1989; Washington and Cheng, 2005). One major disadvantage of these criteria is that the "true" hazardous and safe sites should be known, which is not possible in most situations. Due to the limitation of the traditional FP and FN criteria, Cheng and Washington (2008) developed a series of tests to assess the performance of HSID methods. They are the Site Consistency Test (SCT), the Method Consistency Test (MCT), Total Rank Difference Test (TRDT) and the Poisson Mean Difference Test (PMDT), where the PMDT method applies only if the "true" mean crash frequency was known. Montella (2010) conducted a study to compare the performance of traditional network screening methods. There the author developed a Total Score Test (TST), which is a weighted combination of the SCT, MCT and TRDT methods. These tests represent the most advanced methods in assessing different HSID methods.

The Site Consistency Test (SCT) method is used to measure the ability of a method to consistently identify a site as a hotspot over subsequent observation periods. This test rests on the premise that a true hotspot in period 1 should also have poor safety performance in period 2, given that the crash determinants were not significantly changed. The original SCT method proposed by Cheng and Washington (2008) computes the sum of crash counts in time period $i + 1$ for a certain number of hotspots that were identified by each method in time period $i$. The higher the SCT score, the better the hotspot identification (HSID) method is. The original SCT criterion can be expressed as

$$
\mathrm{SCT}_j = \sum_{k=1}^{n} C_{k(i),\,\mathrm{method}=j,\,i+1}
\tag{13}
$$

where $C_{k(i),\,\mathrm{method}=j,\,i+1}$ represents the crash count in time period $i + 1$ for a site that ranked $k$ in time period $i$ as identified by network screening method $j$, wherein a smaller $k$ indicates a higher crash risk of a site.

The original SCT method is reasonable for crash count based HSID methods, but it is not efficient for methods that are based on the potential for safety improvement (PSI) values. This is because road sites with higher PSI do not necessarily have higher crash counts. Due to this limitation, the authors modified the original SCT test to accommodate the PSI based HSID methods. The new SCT score can be computed from

$$
\mathrm{SCT}_j = \sum_{k=1}^{n} \mathrm{PSI}_{k(i),\,\mathrm{method}=j,\,i+1}
\tag{14}
$$

The SCT method is suitable for the comparison of HSID methods that provide similar estimates of PSI. However, when one method produces relatively lower estimates of PSI in general, it is more likely to have a smaller SCT score. In this case, the smaller SCT score does not necessarily mean that this method is poor in consistently identifying hotspots. In light of this issue, the authors proposed a total safety performance measure difference test, simplified as total performance difference test (TPDT). The TPDT assumes that the hotspots identified by method $j$ with all years of crash data are true hazardous sites. For the top $n$ true hotspots, the difference of PSI estimated in time period $i$ and $i + 1$ was computed. The score of the TPDT test is the sum of the differences for the top $n$ true hotspots, which is shown as

$$
\mathrm{TPDT}_j = \sum_{k=1}^{n} \left(\mathrm{PSI}_{k(i),\,\mathrm{method}=j,\,i+1} - \mathrm{PSI}_{k(i),\,\mathrm{method}=j,\,i}\right)
\tag{15}
$$
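A small Python sketch may help make the PSI-based SCT of Eq. (14) and the TPDT of Eq. (15) concrete for a single HSID method. It assumes PSI estimates are held in dictionaries keyed by site id, and it follows the text in taking the TPDT hotspots from an all-years ranking; the site ids, PSI values and function names are hypothetical.

```python
def top_sites(psi, n):
    """Return the n site ids with the highest PSI, highest first."""
    return sorted(psi, key=psi.get, reverse=True)[:n]


def sct_score(psi_period_i, psi_period_next, n):
    """Eq. (14): sum of period i+1 PSI for the top-n hotspots found in period i."""
    hotspots = top_sites(psi_period_i, n)
    return sum(psi_period_next[site] for site in hotspots)


def tpdt_score(psi_all_years, psi_period_i, psi_period_next, n):
    """Eq. (15): sum of PSI changes between periods for the top-n 'true' hotspots.

    Assumption: 'true' hotspots are ranked on PSI estimated from all years.
    """
    hotspots = top_sites(psi_all_years, n)
    return sum(psi_period_next[site] - psi_period_i[site] for site in hotspots)


# Hypothetical PSI estimates for four sites from one method.
psi_1 = {"A": 5.0, "B": 3.0, "C": 1.0, "D": -2.0}
psi_2 = {"A": 4.0, "B": 0.5, "C": 2.5, "D": -1.0}
psi_all = {"A": 4.5, "B": 1.8, "C": 1.7, "D": -1.5}

print(sct_score(psi_1, psi_2, n=2))
print(tpdt_score(psi_all, psi_1, psi_2, n=2))
```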

The method consistency test (MCT) method was designed to assess a method's performance by measuring the number of sites that were identified as hotspots in both time period $i$ and subsequent time period $i + 1$. This test relies on the same premise as the SCT test. The greater the MCT score, the more reliable and consistent the method is. The MCT test can be expressed as

$$
\mathrm{MCT}_j = \{k_1, k_2, \ldots, k_n\}_i \cap \{k_1, k_2, \ldots, k_n\}_{i+1}
\tag{16}
$$

The total rank difference test (TRDT) method, as the name implies, measures the difference of rankings for hotspots identified in two successive time periods. The smaller the TRDT score, the more reliable and consistent the method is. The TRDT test can be expressed as

$$
\mathrm{TRDT}_j = \sum_{k=1}^{n} \left| R(k_{j,i}) - R(k_{j,i+1}) \right|
\tag{17}
$$

where $R(k_{j,i})$ is the rank of site $k$ in time period $i$ identified by the HSID method $j$.
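The MCT and TRDT scores can be sketched in Python as follows. The sketch assumes that sites are ranked by descending PSI within each period, that the MCT score is the size of the intersection in Eq. (16), and that the TRDT sum runs over the top-n hotspots of period i; the data and function names are hypothetical.

```python
def rank_sites(psi):
    """Map each site id to its rank (1 = highest PSI) within one period."""
    ordered = sorted(psi, key=psi.get, reverse=True)
    return {site: rank for rank, site in enumerate(ordered, start=1)}


def mct_score(psi_period_i, psi_period_next, n):
    """Eq. (16): number of sites in the top n of both periods."""
    ranks_i = rank_sites(psi_period_i)
    ranks_next = rank_sites(psi_period_next)
    top_i = {site for site, rank in ranks_i.items() if rank <= n}
    top_next = {site for site, rank in ranks_next.items() if rank <= n}
    return len(top_i & top_next)


def trdt_score(psi_period_i, psi_period_next, n):
    """Eq. (17): summed absolute rank change of the period-i top-n hotspots."""
    ranks_i = rank_sites(psi_period_i)
    ranks_next = rank_sites(psi_period_next)
    hotspots = [site for site, rank in ranks_i.items() if rank <= n]
    return sum(abs(ranks_i[site] - ranks_next[site]) for site in hotspots)


# Hypothetical PSI estimates for four sites in two successive periods.
psi_1 = {"A": 5.0, "B": 3.0, "C": 1.0, "D": -2.0}
psi_2 = {"A": 4.0, "B": 0.5, "C": 2.5, "D": -1.0}

print(mct_score(psi_1, psi_2, n=2))
print(trdt_score(psi_1, psi_2, n=2))
```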

The abovementioned measures examine various features of different hotspot identification (HSID) methods. In order to integrate these methods and provide a synthetic index, a Total Score Test (TST) method was proposed by Montella (2010). The original TST test was modified to account for the total performance difference test (TPDT) that was proposed in this paper. The new TST score is given as:

$$
\mathrm{TST}_j = \frac{100}{4} \cdot \left[ \frac{\mathrm{SCT}_j}{\max \mathrm{SCT}} + \left(1 - \frac{\mathrm{TPDT}_j - \min \mathrm{TPDT}}{\max \mathrm{TPDT}}\right) + \frac{\mathrm{MCT}_j}{\max \mathrm{MCT}} + \left(1 - \frac{\mathrm{TRDT}_j - \min \mathrm{TRDT}}{\max \mathrm{TRDT}}\right) \right]
\tag{18}
$$

The TST test assigns the same weight to the SCT, MCT, TPDT and TRDT tests. If method $j$ performs the best in all the four tests, then the TST score is 100.
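The combination in Eq. (18) can be sketched as a short Python function that takes the four test scores of each method and returns the TST score per method. The method labels and score values below are hypothetical and serve only to show the arithmetic; the sketch also assumes that the maxima in the denominators are non-zero.

```python
def tst_scores(sct, tpdt, mct, trdt):
    """Eq. (18): equally weighted total score for each HSID method.

    Each argument maps a method name to that method's score on one test.
    Assumes max SCT, max TPDT, max MCT and max TRDT are all non-zero.
    """
    max_sct, max_mct = max(sct.values()), max(mct.values())
    max_tpdt, min_tpdt = max(tpdt.values()), min(tpdt.values())
    max_trdt, min_trdt = max(trdt.values()), min(trdt.values())
    scores = {}
    for method in sct:
        total = (
            sct[method] / max_sct
            + (1.0 - (tpdt[method] - min_tpdt) / max_tpdt)
            + mct[method] / max_mct
            + (1.0 - (trdt[method] - min_trdt) / max_trdt)
        )
        scores[method] = 100.0 / 4.0 * total
    return scores


# Hypothetical test scores for two methods (not results from the paper).
sct = {"EB": 80.0, "PTSRE": 100.0}
tpdt = {"EB": 30.0, "PTSRE": 20.0}
mct = {"EB": 15.0, "PTSRE": 20.0}
trdt = {"EB": 60.0, "PTSRE": 40.0}

scores = tst_scores(sct, tpdt, mct, trdt)
for method, score in scores.items():
    print(method, round(score, 2))
```

In this made-up example the second method is best on all four tests, so its score is 100, which matches the property stated after Eq. (18).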

3. Data preparation

Urban 4 lane divided arterials from the Central Florida area (Orange, Seminole and Osceola counties) were selected for this study. Crashes that occurred from 2006 to 2009 were obtained for this analysis. Road inventory data and traffic information were collected from the FDOT road characteristics inventory (RCI) system.

The collected urban 4 lane divided arterials were further separated into homogeneous segments. The segmentation was based on median type, median width, inside shoulder type and width, outside shoulder type and width, as well as the traffic volume. Therefore, the obtained road segments are deemed homogeneous with respect to these variables. The intersection influence area (250 feet on both sides of intersections) was removed from this dataset. As a result, a total of 664 urban 4 lane divided arterial road segments were obtained.

Crash data was acquired from the FDOT crash analysis reporting (CAR) system. Two forms of crash report are used in the State of Florida, short form and long form crash reports. A long form is used when the following criteria are met: (1) death or personal injury; (2) leaving the scene involving damage to attended vehicles or property, and (3) driving while under the influence. The police officer can also complete a long form for a property damage only (PDO) crash at his discretion, whereas a short form is used to report other types of PDO traffic crashes. Long form crash data and short form crash data can be obtained from the CAR system and the Signal Four Analytics (S4A) system separately. However, the S4A system is still under development and only has complete short form crash records in 2008 and 2009 for the study area, which is not sufficient for the current research. Thus, the authors adopted only long form crash data from 2006 to 2009 in this study. Apparently, the numbers of total crashes are not valid due to the missing PDO crashes reported by short forms, which may bias the ranking of hazardous sites. Hence, the current research focuses on FI crashes only. As a result, a total of 8895 FI crashes were employed for this study. The descriptive statistics of the final data employed in this study are presented in Table 1.

Table 1
Descriptive statistics of collected data.

Variables  Description                   Year  Min   Max      Mean       Std
Count      Crash frequency               2006  0     40       3.47       6.53
                                         2007  0     50       3.54       6.72
                                         2008  0     52       3.21       6.06
                                         2009  0     52       3.18       6.35
AADT       Annual average daily traffic  2006  4500  116,000  31,281.75  15,591.20
                                         2007  2800  118,000  31,281.75  15,579.31
                                         2008  2800  118,000  30,935.84  16,653.84
                                         2009
Length     Segment length                –

Std: standard deviation.

. Results

.1. Modeling results

Overall four models were fitted in this section. The Poissonamma (PG) model was fitted with the mean crash frequency ofach road site in 4 years (2006–2009) as the target variable, and theogarithm of the mean AADT and segment length as the predictors.hus, the total number of observation for the PG model is 664. Theoisson Log-Normal (PLN), Poisson temporal random effect (PTRE),oisson temporal - spatial random effect (PTSRE) models were fit-ed with the original panel data for 4 years separately. In otherords, the crash and traffic record for each road site in each yearas treated as one observation in these models. Hence, the totalumbers of observations for these three models are 2656 (664*4).odeling results of these models are presented in Table 2.Both the deviance information criterion (DIC) and root mean

quare error (RMSE) values indicate that the PTSRE model out-erforms the PTRE model in fitting the crash data, and both are

uperior to the PLN model. Note that the DIC value of the PG models not comparable to others, because it is based on a smaller num-er of observations. Additionally, the intercept of the PTSRE model

s smaller than that of other models, which again implies that the

able 2escription of modeling results.

Variable Mean SD 2.5% 97.5%

Poisson-Gamma (PG)Intercept −8.09 0.29 −8.68 −7.66logAADT 0.85 0.03 0.80 0.91seglng 0.52 0.08 0.37 0.67k 2.04 0.16 1.74 2.37

DIC = 1781.59 RMSE = 7.35Poisson Log-Normal (PLN)Intercept −19.86 0.63 −20.92 −18.82logAADT 1.88 0.06 1.79 1.99seglng 0.39 0.02 0.35 0.42Delta (ı2) 2.58 0.08 2.35 2.86

DIC = 8320.11 RMSE = 2.0Poisson with temporal random effect (PTRE)Intercept −5.70 0.34 −6.50 −5.22logAADT 0.47 0.03 0.42 0.54seglng 0.54 0.12 0.28 0.75Delta (ı2) 4.31 0.31 3.67 5.12

DIC = 7195.13 RMSE = 1.98Poisson with both temporal and spatial random effect (PTSRE)Intercept −4.799 0.45 −5.761 −4.097logAADT 0.22 0.05 0.11 0.29seglng 0.75 0.07 0.61 0.87Delta1 (ı2

1) 0.77 0.06 0.65 0.93Delta2 (ı2

2) 8.51 0.99 6.01 12.69DIC = 6969.91 RMSE = 1.83

700 112,000 31,805.34 15,757.05.10 6.28 0.66 0.80

PTSRE model is superior to others. Table 2 shows that the parameterestimates of the PTRE and PTSRE models are significantly differentfrom those of the PG and the PLN models. ı2 implies the varianceof the error terms in corresponding models. It is observed that ı2 inthe PTRE model is greater than that in the PLN model, which indi-cates that the variance of the crash counts by site is greater thanthat by crash observation. In the PTSRE model, ı2

1 and ı22 represents

the variance of the site specific error term and the corridor specificerror term. It is seen that the variation among corridors representthe majority variance among crash counts.
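To make the last observation concrete, the share of the random effect variance attributable to corridors can be computed from the posterior means of δ₁² and δ₂² reported in Table 2. The short Python sketch below is a back-of-the-envelope calculation on the log scale rather than part of the original analysis; the variable names are illustrative, and it assumes that the two error terms are independent so that their variances add.

```python
# Posterior means of the PTSRE variance parameters (Table 2).
delta1_sq = 0.77   # variance of the site specific error term
delta2_sq = 8.51   # variance of the corridor specific error term

total_variance = delta1_sq + delta2_sq
corridor_share = delta2_sq / total_variance
site_share = delta1_sq / total_variance

print(f"corridor share: {corridor_share:.1%}")  # roughly 92%
print(f"site share:     {site_share:.1%}")      # roughly 8%
```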

4.2. Network screening

In order to compare the performance of the hotspot identification (HSID) methods, the mean potential for safety improvement (PSI) over all four years and over two time periods (2006–2007 and 2008–2009) was computed for each road segment. Hotspots were ranked by PSI value in descending order, so that the top-ranked hotspot has the highest PSI. The top 10 hotspots identified by each method are presented in Table 3.
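The ranking step can be sketched in a few lines of Python. The sketch assumes that PSI is the expected crash frequency minus the predicted crash frequency, truncated at zero; this matches the way PSI is discussed elsewhere in this paper but is stated here as an assumption. The site labels and frequencies are hypothetical.

```python
def psi(expected, predicted):
    """PSI, assumed to be expected minus predicted crash frequency.

    Negative values are set to zero, since a negative PSI has no
    practical meaning.
    """
    return max(0.0, expected - predicted)


def rank_hotspots(sites, top_k):
    """Return the top_k (site, PSI) pairs sorted by descending PSI."""
    scored = [(site, psi(expected, predicted))
              for site, (expected, predicted) in sites.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical (expected, predicted) crash frequencies per site.
sites = {"A": (44.0, 2.5), "B": (12.0, 15.0), "C": (30.0, 4.0), "D": (8.0, 3.0)}
top = rank_hotspots(sites, top_k=3)
print(top)  # [('A', 41.5), ('C', 26.0), ('D', 5.0)]
```

Site B is predicted to have more crashes than expected, so its PSI is truncated to zero and it falls to the bottom of the ranking.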

The performance of these methods in identifying hotspots was investigated by conducting a series of tests, the results of which are presented in Table 4. To comply with the five percent criterion, the values of each test are provided for the top 10, top 20 and top 35 hotspots.
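The tests summarized in Table 4 compare the rankings obtained in the two time periods. As an informal illustration, the Python sketch below computes four of them for the PTSRE top 10 lists in Table 3. The definitions coded here are assumptions inferred from how the tests are described in this paper: SCT as the period 2 PSI total of the period 1 hotspots, TPDT as their total absolute PSI change, MCT as the number of hotspots common to both periods, and TRDT as the total absolute rank change. Because all ten PTSRE sites reappear in period 2, no rankings beyond the published top 10 are needed.

```python
# PTSRE top 10 hotspots from Table 3 as (site ID, PSI), in rank order.
period1 = [(208, 43.66), (212, 40.36), (372, 39.38), (198, 33.99), (267, 28.40),
           (122, 27.04), (369, 26.90), (367, 26.42), (112, 25.62), (110, 25.55)]
period2 = [(208, 44.15), (212, 40.84), (372, 39.99), (198, 33.95), (122, 27.60),
           (267, 27.53), (367, 26.00), (112, 25.75), (369, 25.52), (110, 25.28)]

psi1, psi2 = dict(period1), dict(period2)
rank1 = {site: rank for rank, (site, _) in enumerate(period1, start=1)}
rank2 = {site: rank for rank, (site, _) in enumerate(period2, start=1)}
top1 = [site for site, _ in period1]   # hotspots identified in period 1

# Method consistency: hotspots identified in both periods.
mct = len(set(rank1) & set(rank2))
# Site consistency: period 2 PSI total of the period 1 hotspots.
sct = sum(psi2[site] for site in top1)
# Total performance difference: absolute PSI change of the period 1 hotspots.
tpdt = sum(abs(psi1[site] - psi2[site]) for site in top1)
# Total rank difference: absolute rank change of the period 1 hotspots.
trdt = sum(abs(rank1[site] - rank2[site]) for site in top1)

print(mct, round(sct, 2), round(tpdt, 2), trdt)
```

Under these assumed definitions the results are close to the PTSRE top 10 entries of Table 4 (MCT 10, SCT 316.59, TPDT 5.25, TRDT 6); the small gap in SCT presumably reflects rounding of the published PSI values.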

Table 4 shows that the PTSRE method is superior to the other methods at almost all ranking levels. Specifically, the site consistency tests (SCT) indicate that the hotspots identified by the PTSRE method in time period 1 (2006–2007) also present very high crash risk in time period 2 (2008–2009) in terms of the total PSI. The PTRE method is slightly worse than the PTSRE method, while both are much better than the EB and PLN methods. The total performance difference tests (TPDT) give almost the same picture of each method as the SCT. However, according to the TPDT, the EB method produces a slightly lower PSI difference than the PLN model. This inconsistency between the SCT and the TPDT for the EB and PLN methods may be attributed to the overall lower PSI values estimated by the EB method. The method consistency tests (MCT) show that the PTSRE method is almost perfect in identifying hotspots: the top 10 and top 35 hotspots match exactly in the two time periods, and only 2 of the top 20 hotspots do not match. In contrast, only 8, 14 and 27 of the top 10, 20 and 35 hotspots are consistent across the two time periods for the PLN method. The total rank difference tests (TRDT) again indicate that the PTSRE method generally outperforms the other methods (the PTRE method is marginally better only at the top 35 level), while the EB and PLN methods perform almost equally poorly. For example, the rank difference of the top 20 sites is 15 for the PTSRE method, but 118 for the EB method. In order to measure the overall performance of each method, the total score tests (TST) were conducted. The TST show that the PTSRE method works slightly better than the PTRE method at all ranking levels, and both methods are far better than the PLN and EB methods.
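As a rough check on how a total score can combine the other four tests, the Python sketch below assumes an equal-weight score: SCT and MCT are divided by their maximum across methods, TRDT and TPDT are converted to "1 minus the excess over the best method, divided by the maximum", and the four components are averaged and scaled to 100. This formula is an assumption rather than a quotation from the paper, although it appears to reproduce the top 10 TST row of Table 4 to within rounding. The input values are the top 10 entries of Table 4.

```python
# Top 10 test values per method, taken from Table 4.
table4_top10 = {
    "EB":    {"SCT": 231.42, "TPDT": 53.64, "MCT": 8,  "TRDT": 35},
    "PLN":   {"SCT": 249.51, "TPDT": 55.96, "MCT": 8,  "TRDT": 36},
    "PTRE":  {"SCT": 296.56, "TPDT": 11.36, "MCT": 9,  "TRDT": 9},
    "PTSRE": {"SCT": 316.59, "TPDT": 5.25,  "MCT": 10, "TRDT": 6},
}


def total_scores(results):
    """Equal-weight total score per method (assumed formula, see text)."""
    def column(name):
        return [values[name] for values in results.values()]

    max_sct, max_mct = max(column("SCT")), max(column("MCT"))
    min_trdt, max_trdt = min(column("TRDT")), max(column("TRDT"))
    min_tpdt, max_tpdt = min(column("TPDT")), max(column("TPDT"))

    scores = {}
    for method, values in results.items():
        components = [
            values["SCT"] / max_sct,                      # higher is better
            values["MCT"] / max_mct,                      # higher is better
            1 - (values["TRDT"] - min_trdt) / max_trdt,   # lower is better
            1 - (values["TPDT"] - min_tpdt) / max_tpdt,   # lower is better
        ]
        scores[method] = 100 * sum(components) / len(components)
    return scores


scores = total_scores(table4_top10)
for method, score in scores.items():
    print(f"{method}: {score:.2f}")
```

A method that is best on every test scores 100, which is the case for the PTSRE method at the top 10 level.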
Table 3
The ranking of top 10 hotspots by different methods.

        EB                                       PLN
Rank    06–09        06–07        08–09          06–09        06–07        08–09
        ID   PSI     ID   PSI     ID   PSI       ID   PSI     ID   PSI     ID   PSI
1       208  38.47   208  36.05   208  39.07     208  41.24   208  39.68   212  42.89
2       198  29.26   198  35.33   212  35.83     212  35.93   198  38.98   208  42.81
3       212  29.12   267  27.38   122  23.85     198  31.83   267  31.26   122  26.93
4       267  23.22   212  21.80   210  22.09     267  25.99   212  28.98   198  24.68
5       122  22.89   222  21.62   198  21.71     122  25.38   222  24.85   210  24.64
6       112  20.92   112  20.92   110  20.37     112  23.34   112  23.88   110  23.31
7       110  20.78   122  20.70   207  19.94     222  23.26   122  23.84   112  22.80
8       222  20.75   110  20.06   112  19.80     110  23.11   110  22.92   207  22.78
9       210  20.45   203  19.65   222  18.75     210  22.16   203  22.35   222  21.66
10      203  19.50   204  18.60   203  18.37     203  21.66   75   22.04   203  20.96

        PTRE                                     PTSRE
Rank    06–09        06–07        08–09          06–09        06–07        08–09
        ID   PSI     ID   PSI     ID   PSI       ID   PSI     ID   PSI     ID   PSI
1       208  43.33   208  43.83   208  42.83     208  43.91   208  43.66   208  44.15
2       212  38.52   212  39.11   212  37.92     212  40.60   212  40.36   212  40.84
3       198  33.64   198  33.99   372  33.58     372  39.69   372  39.38   372  39.99
4       372  33.04   372  32.50   198  33.29     198  33.97   198  33.99   198  33.95
5       267  27.04   267  28.64   122  26.29     267  27.97   267  28.40   122  27.60
6       122  26.86   122  27.44   267  25.45     122  27.32   122  27.04   267  27.53
7       112  24.86   222  25.30   112  25.15     369  26.21   369  26.90   367  26.00
8       110  24.65   112  24.57   110  24.94     367  26.21   367  26.42   112  25.75
9       222  24.56   110  24.37   222  23.81     112  25.68   112  25.62   369  25.52
10      203  23.80   210  24.05   203  23.60     110  25.42   110  25.55   110  25.28

Table 4
The performance evaluation of different hotspot identification methods.

Criteria  Rank    EB       PLN      PTRE     PTSRE
SCT       TOP10   231.42   249.51   296.56   316.59
          TOP20   362.09   416.69   506.46   543.32
          TOP35   511.48   624.09   748.29   809.26
TPDT      TOP10   53.64    55.96    11.36    5.25
          TOP20   79.44    91.07    17.73    7.21
          TOP35   142.78   153.31   23.93    13.58
MCT       TOP10   8        8        9        10
          TOP20   15       14       19       18
          TOP35   26       27       33       35
TRDT      TOP10   35       36       9        6
          TOP20   118      112      25       15
          TOP35   365      338      40       42
TST       TOP10   46.52    46.22    91.11    100.00
          TOP20   44.75    44.02    93.30    98.68
          TOP35   41.04    45.37    95.00    99.86

5. Discussion

According to the network screening results for all 4 years, it is observed that the hotspots identified by different methods are not consistent. The numbers of hotspots that were consistently identified by each pair of methods are summarized in Table 5.

It can be observed that the numbers of sites that were consistently identified as hotspots by the Empirical Bayesian (EB) and full Bayesian Poisson Log-Normal (PLN) methods at the top 10, 20 and 35 levels are 10, 19 and 34, which means that these two methods produce almost the same hotspots. In contrast, the numbers of consistent hotspots between the PLN and Poisson temporal-spatial random effect (PTSRE) methods are 7, 15 and 30, respectively. This means that 3 out of the top 10 hotspots in these two methods do not match.

Table 5
Matrix of consistent hotspots by different methods.

         TOP10                     TOP20                     TOP35
         EB   PLN  PTRE  PTSRE     EB   PLN  PTRE  PTSRE     EB   PLN  PTRE  PTSRE
EB       –    10   9     7         –    19   18    15        –    34   31    30
PLN      10   –    9     7         19   –    18    15        34   –    31    30
PTRE     9    9    –     8         18   18   –     17        31   31   –     34
PTSRE    7    7    8     –         15   15   17    –         30   30   34    –

Tables 6 and 7 present the observed crash counts over the years and the corresponding potential for safety improvement (PSI) estimates by different methods for the top 10 hotspots identified by the PTSRE and the PLN methods.

Table 6 shows that most of the top 10 hotspots identified by the PTSRE method have similar PSI estimates across different methods, except sites 372, 369 and 367 (shaded gray in Table 6). On the other hand, the top 10 hotspots identified by the PLN method have consistent PSI estimates across different methods (Table 7). Note that a PSI less than 0 was treated as 0, because a negative PSI does not have a practical meaning. To further investigate the source of the difference between the PTSRE and the PLN methods, the authors extracted the 10 sites that have the largest absolute difference in estimated PSI values. Fig. 1 presents the crash counts, predicted crash frequency and expected crash frequency of the PTSRE and PLN methods. It is found that the expected crash frequencies estimated by both methods are almost the same for the selected sites. However, the predicted crash counts vary significantly between the two methods. The PLN method always produces higher predicted crash counts for these sites. Thus, it is clear that the difference in the PSI values is primarily due to the variation in the predicted crash frequencies.

The rationale for using the EB and FB methods is that they can borrow information from the reference population and shrink the observed mean to the “real” mean. In the current paper, the authors are interested in the influence of the temporal and spatial correlation on the shrinkage. Fig. 2 presents the observed crash counts and the PSI of 10 sites that behave differently in the PLN and the Poisson temporal random effect (PTRE) methods.
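The shrinkage idea can be illustrated with the standard EB estimator for a Poisson-Gamma (negative binomial) model, in which the estimate is a weighted average of the model prediction and the observed count. The Python sketch below is a textbook form and not the authors' code. It assumes the parameterization Var(y) = μ + μ²/k, so that the weight placed on the prediction is w = 1/(1 + μ/k); whether the k reported in Table 2 follows this parameterization is not stated in this section, so the numbers are purely illustrative.

```python
def eb_estimate(observed, predicted, k):
    """Empirical Bayes estimate that shrinks an observed count toward the prediction.

    Assumes a negative binomial model with Var(y) = mu + mu**2 / k, so a
    larger prediction relative to k puts more weight on the observed count.
    """
    weight = 1.0 / (1.0 + predicted / k)   # weight on the model prediction
    return weight * predicted + (1.0 - weight) * observed


# Illustrative values: 40 observed crashes at a site where the model predicts 4.
estimate = eb_estimate(observed=40, predicted=4.0, k=2.0)
print(round(estimate, 2))  # 28.0
```

The estimate always lies between the prediction and the observation, which is the sense in which the observed mean is pulled toward the reference population.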

Fig. 1. The predicted and expected crash frequency of typical hotspots by the PLN and PTSRE methods.

Table 6
PSI estimates of the top 10 hotspots identified by the PTSRE method.

        Count                     PSI
Site    2006  2007  2008  2009    EB      PLN     PTRE    PTSRE
208     40    46    40    52      38.47   41.24   43.33   43.91
212     37    32    52    44      29.12   35.93   38.52   40.60
372     40    49    37    50      0.00    8.73    33.04   39.69
198     34    50    22    33      29.26   31.83   33.64   33.97
267     35    33    23    23      23.22   25.99   27.04   27.97
122     27    26    27    32      22.89   25.38   26.86   27.32
369     38    48    37    32      0.00    0.00    10.37   26.21
367     38    36    39    22      0.00    0.00    17.24   26.21
112     26    27    25    26      20.92   23.34   24.86   25.68
110     25    26    20    32      20.78   23.11   24.65   25.42

Table 7
PSI estimates of the top 10 hotspots identified by the PLN method.

        Count                     PSI
Site    2006  2007  2008  2009    EB      PLN     PTRE    PTSRE
208     40    46    40    52      38.47   41.24   43.33   43.91
212     37    32    52    44      29.12   35.93   38.52   40.60
198     34    50    22    33      29.26   31.83   33.64   33.97
267     35    33    23    23      23.22   25.99   27.04   27.97
122     27    26    27    32      22.89   25.38   26.86   27.32
112     26    27    25    26      20.92   23.34   24.86   25.68
222     31    24    18    30      20.75   23.26   24.56   24.93
110     25    26    20    32      20.78   23.11   24.65   25.42
210     28    16    28    26      20.45   22.16   23.68   24.09
203     26    25    25    23      19.50   21.66   23.80   24.35

Fig. 2. PSI values estimated by the PLN and PTRE methods for typical sites.

Fig. 2 shows that the PSI values estimated by the PTRE model are consistent in the two time periods, while those estimated by the PLN model vary significantly over time. This is expected because the PLN model treats each site in each year as an independent observation. Therefore, the information from the same site in different years and the information from other sites contribute equally to the estimation of the PSI for each site. In contrast, the PTRE model includes a site specific error term, which shrinks the observed mean of each time period to the “real” mean over the years. Hence, given the assumption that no change was applied to the roads, the model that accounts for the temporal correlation is more trustworthy.
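One way to see why a site specific error term stabilizes estimates across periods is to look at the correlation it induces between yearly counts at the same site. The Python sketch below uses a generic Poisson log-normal mixture rather than the fitted models: counts in different years are assumed to be conditionally independent Poisson given a site mean λ = m·exp(u), where u is normal with mean 0 and variance s². Under these assumptions Cov(Y₁, Y₂) = Var(λ) and Var(Y) = E[λ] + Var(λ). The function name and the baseline mean m are hypothetical.

```python
import math


def between_year_correlation(baseline_mean, site_variance):
    """Correlation of two yearly counts that share a log-normal site effect."""
    # Moments of lambda = baseline_mean * exp(u), with u ~ Normal(0, site_variance).
    mean_lambda = baseline_mean * math.exp(site_variance / 2)
    var_lambda = (baseline_mean ** 2) * math.exp(site_variance) * (math.exp(site_variance) - 1)
    # Poisson noise adds E[lambda] to the variance of each yearly count.
    return var_lambda / (var_lambda + mean_lambda)


for s2 in (0.0, 0.5, 1.0, 2.0):
    print(s2, round(between_year_correlation(1.0, s2), 3))
```

With no site effect the yearly counts are uncorrelated, and the correlation rises toward 1 as the site variance grows, which is consistent with rankings that change little between periods.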

In order to investigate the effects of the spatial error term, the PSI values estimated by the PTSRE and the PTRE models for sites within one specific corridor are presented in Fig. 3.

Fig. 3. PSI estimates for typical sites within one corridor by the PTRE and PTSRE methods.

Fig. 3 shows that the estimated PSI values vary by a great amount for sites 367, 369 and 372. These three sites have relatively high observed crash counts. Both models provide almost equally high PSI estimates for site 372. However, the PTRE model produces lower PSI values for sites 367 and 369 than the PTSRE model. This implies that the inclusion of the spatial error term leads to more consistent PSI estimates for sites that have similar observed crash counts.

6. Conclusions

Previous research has adopted many different methods to identify crash risk hotspots. With the development of statistical tools, more and more scholars have started to use full Bayesian methods in traffic safety research. However, very few of them have attempted to employ full Bayesian models for hotspot identification. The current paper accounts for the potential temporal and spatial correlations in the full Bayesian models to identify hotspots. Urban 4-lane divided arterials in the Central Florida area were investigated. The fatal and injury crash data from 2006 to 2009 were employed for this study.

A total of four methods were evaluated in this paper: the Poisson-Gamma (PG) model based EB method, the full Bayesian Poisson Log-Normal model (PLN), the full Bayesian Poisson model with a random effect term to account for the temporal correlation (PTRE), and the full Bayesian Poisson model with two random effect terms to account for both the temporal and spatial correlations (PTSRE). The modeling results indicate that the PTSRE model outperforms the PTRE model, and both are significantly superior to the PLN and PG models in fitting the panel crash data.

Five criteria were employed to compare the efficiency of these methods in identifying hotspots. The results show that the PTSRE method performs slightly better than the PTRE method, and both are far better than the PLN and EB methods. Specifically, the PTSRE method can consistently identify most of the hotspots in successive time periods, while the PLN and EB methods are relatively poor. The results indicate that the inclusion of the temporal and spatial correlation terms in network screening is important given that no treatment was applied during the study period. It is noteworthy that the method evaluation was based on the assumption that crash determinants in the selected segments did not change significantly over the studied time periods. If the assumption does not hold, then the hotspots in different time periods are not comparable.

The results suggest that the employment of the PTSRE model for network screening is a good alternative for traffic safety management agencies. Moreover, the potential for safety improvement (PSI) based site consistency tests (SCT) and the total safety performance measure difference tests (TPDT) proposed in this paper can be employed to compare the performance of different hotspot identification (HSID) methods.

Due to space limitations, the current paper conducted the network screening based on the PSI of each road segment regardless of its length. In other words, the hotspots identified in this paper may not reflect the cost-effectiveness of treatment for each site. Further studies that compare these methods while accounting for the cost-effectiveness of treatments are desirable. Methodologically, the current paper assumes that the spatial correlation for sites within one corridor is independent of distance, and that the temporal correlation of sites over the years is independent of the time gap. Future studies would need to account for distance based spatial correlation and time series based temporal correlation.

Acknowledgement

The authors would like to thank the Florida Department of Transportation (FDOT) for funding this study.

References

Aguero-Valverde, J., Jovanis, P.P., 2009. Bayesian multivariate Poisson log-normalmodels for crash severity modeling and site ranking. In: Presented at the 88thAnnual Meeting of the Transportation Research Board.

Alluri, P., 2008. Assessment of potential site selection methods for use in prioritizingsafety improvements on Georgia roadways. M. Sc. Thesis. Clemson University.

Caliendo, C., Guida, M., Parisi, A., 2007. A crash-prediction model for multilane roads.Accident Analysis and Prevention 39, 657–670.

Carlin, B.P., Louis, T.A., 2000. Bayes and Empirical Bayes Methods for Data Analysis,Monographs on Statistics and Applied Probability, 69. Chapman & Hall, London.

Cheng, W., Washington, S., 2008. New criteria for evaluating methods of identifyinghot spots. Transportation Research Record 2083, 76–85.

Chin, H.C., Quddus, M.A., 2003. Applying the random effect negative binomial modelto examine traffic accident occurrence at signalized intersections. Accident Anal-ysis and Prevention 35, 253–259.

Greene, W., 2000. Econometric Analysis, 4th ed. Prentice Hall, New Jersey.Hauer, E., 1997. Observational Before–After Studies in Road Safety. Pergamon Press,

Tarrytown, N.Y.Higle, J.L., Hecht, M.B., 1989. A Comparison of Techniques for the Identification of

Hazardous Locations. In Transportation Research Record 1238, TRB. NationalResearch Council, Washington, D.C., pp. 10–19.

Hoff, P.D., 2009. A First Course in Bayesian Statistical Methods. Springe DordrechtHeidelberg, London, New York, pp. 45–48.

Huang, H.L., Chin, H.C., Haque, M.M., 2009. Hotspot identification: a full Bayesianhierarchical modeling approach. Transportation and Traffic Theory, 441–462http://link.springer.com/chapter/10.1007/978-1-4419-0820-9 22

Page 9: Application of Poisson random effect models for highway network screening

8 ysis an

J

J

J

L

M

M

M

M

M

2 X. Jiang et al. / Accident Anal

iang, X.M., Huang, B.S., Zaretzki, R., Richards, S., Yan, X.D., 2013. Estimat-ing safety effects of pavement management factors utilizing Bayesianrandom effect models. Traffic Injury Prevention, http://dx.doi.org/10.1080/15389588.2012.756582.

oshua, S.C., Garber, N.J., 1990. Estimating truck accident rate and involvements usinglinear and Poisson regression models. Transportation Planning and Technology15, 41–58.

ovanis, P.P., Chang, H.L., 1986. Modeling the relationship of accidents to miles trav-eled. Transportation Research Record 1068, 42–51.

ord, D., Mannering, F.L., 2010. The statistical analysis of crash-frequency data: areview and assessment of methodological alternatives. Transportation ResearchPart A 44 (5), 291–305.

a, J.M., Kockelman, K.M., Damien, P., 2008. A multivariate Poisson-lognormalregression model for prediction of crash counts by severity, using Bayesianmethods. Accident Analysis and Prevention 40, 964–975.

aher, M.J., Mountain, L.J., 1988. The identification of accident blackspots: a com-parison of current methods. Accident Analysis and Prevention 20 (2), 143–151.

cGuigan, D.R.D., 1981. The use of relationships between road accidents and traf-fic flow in “blackspot” identification. Traffic Engineering and Control 22 (8/9),

448–451, 453.

cGuigan, D.R.D., 1982. Non-junction accident rates and their use in “black-spot”identification. Traffic Engineering and Control 23 (2), 60–65.

iaou, S.P., Lum, H., 1993. Modeling vehicle accidents and highway geometric designrelationships. Accident Analysis and Prevention 25, 689–709.

d Prevention 63 (2014) 74– 82

Montella, A., 2010. A comparative analysis of hotspot identification methods. Acci-dent Analysis and Prevention 42, 571–581.

Park, E.-S., Lord, D., 2007. Multivariate Poisson-lognormal models for jointlymodeling crash frequency by severity. Transportation Research Record 2019,1–6.

Persaud, B.N., 1999. Empirical Bayes procedure for ranking sites for safety investiga-tion by potential for safety improvement. Transportation Research Record 1665,7–12.

Shankar, V.N., Albin, R.B., Milton, J.C., Mannering, F.L., 1998. Evaluating mediancrossover likelihoods with clustered accident counts: an empirical inquiry usingthe random effects negative binomial model. Transportation Research Record1635, 44–48.

Spiegelhalter, D.J., Best, N.G., Carlin, B.P., Linde, V.D., 2003. Bayesian measures ofmodel complexity and fit (with discussion). Journal of the Royal Statistical Soci-ety B64 (4), 583–616.

U.S. Department of Transportation, 1998. Transportation equity act for the 21stcentury (TEA-21). Washington, D.C.

U.S. Department of Transportation, 2009. Safe, Accountable, Flexible, Efficient,Transportation Equity Act: A Legacy for Users (SAFETEA-LU). Washington, D.C.

Ulfarsson, G.F., Shankar, V.N., 2003. Accident count model based on multiyear cross-sectional roadway data with serial correlation. Transportation Research Record1840, 193–197.

Washington, S., Cheng, W., December 2005. High Risk Crash Analysis. FHWA-AZ-05-558. Arizona Department of Transportation, Phoenix.