17
Using Temporal Detrending to Observe the Spatial Correlation of Traffic 1 Alireza Ermagun (Corresponding Author) 2 Ph.D. Candidate 3 University of Minnesota, Department of Civil, Environmental, and Geo- Engineering 4 500 Pillsbury Drive SE, Minneapolis, MN 55455 USA 5 [email protected] 6 Snigdhansu Chatterjee 7 Associate Professor 8 University of Minnesota, School of Statistics 9 224 Church Street SE, Minneapolis, MN 55455 USA 10 [email protected] 11 David Levinson 12 Professor 13 University of Minnesota and University of Sydney 14 500 Pillsbury Drive SE, Minneapolis, MN 55455 USA 15 [email protected] 16 Paper submitted for: 17 Presentation at 96 th Annual Transportation Research Board Meeting, January 2017 18 Standing Committee on Freeway Operations (AHB20) 19 4745 words + 5 figures + 1 tables 20 July 31, 2016 21

1 Using Temporal Detrending to Observe the Spatial

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: 1 Using Temporal Detrending to Observe the Spatial

Using Temporal Detrending to Observe the Spatial Correlation of Traffic1

Alireza Ermagun (Corresponding Author)2Ph.D. Candidate3University of Minnesota, Department of Civil, Environmental, and Geo- Engineering4500 Pillsbury Drive SE, Minneapolis, MN 55455 [email protected]

Snigdhansu Chatterjee7Associate Professor8University of Minnesota, School of Statistics9224 Church Street SE, Minneapolis, MN 55455 [email protected]

David Levinson12Professor13University of Minnesota and University of Sydney14500 Pillsbury Drive SE, Minneapolis, MN 55455 [email protected]

Paper submitted for:17Presentation at 96thAnnual Transportation Research Board Meeting, January 201718Standing Committee on Freeway Operations (AHB20)19

4745 words + 5 figures + 1 tables20July 31, 201621

Page 2: 1 Using Temporal Detrending to Observe the Spatial

ABSTRACT1This empirical study sheds light on the correlation of traffic links under different traffic regimes.2We mimic the behavior of real traffic by pinpointing the correlation between 140 freeway traffic3links in a sub-network of the Minneapolis - St. Paul highway system with a grid-like network4topology. This topology enables us to juxtapose positive correlation with negative correlation,5which has been overlooked in short-term traffic forecasting models. To accurately and reliably6measure the correlation between traffic links, we develop an algorithm that eliminates temporal7trends in three dimensions: (1) hourly dimension, (2) weekly dimension, and (3) system dimension8for each link. The correlation of traffic links exhibits a stronger negative correlation in rush hours,9when congestion affects route choice. Although this correlation occurs mostly in parallel links, it10is also observed upstream, where travelers receive information and are able to switch to substitute11paths. Irrespective to the time-of-day and day-of-week, a strong positive correlation is witnessed12between upstream and downstream links. This correlation is stronger in uncongested regimes, as13traffic flow passes through consecutive links more quickly and there is no congestion effect to shift14or stall traffic. The extracted correlation structure can augment the accuracy of short-term traffic15forecasting models.16

Keywords: Highway Network; Traffic Flow; Spatial Correlation; Data Detrending; Traffic17Forecasting18

Page 3: 1 Using Temporal Detrending to Observe the Spatial

Ermagun, Chatterjee, and Levinson 2

INTRODUCTION1The rapid development of technology and availability of large amounts of data enhances the ability2to monitor traffic data over time and space, and eases analyzing the correlation between traffic3links. Traffic analysts have utilized the spatial dependency of road segments to solve three typical4problems in a traffic network: (1) short-term traffic forecasting (1), (2) reliable path problem (2),5and (3) missing data estimation (3), discussed in the next section.6

Irrespective of the problem, studies have revealed the positive spatial correlation between7road segments for two main reasons. First, the topology of studied networks typically consists8of traffic links that are immediately upstream or downstream, and thereby they exhibit positive9correlation in terms of traffic due to the physics of conservation of flow. Second, traffic rises and10falls by time-of-day, day-of-week, and week-of-year across the network, more-or-less independent11of spatial configuration. Failing to extract the exact temporal dependency of traffic characteristics12throughout a network results in neglecting the negative correlation between traffic links.13

The positive correlation stands on traffic flow theory, and more precisely on the time-space14diagram. This positivity derives from vehicles observed upstream at one time slice being observed15downstream in the same or at a later time slice. It is true when we assume changing the road16costs and demands over time has no impact. In reality, however, traffic may shift from one road to17another due to traveler responses to congestion or closure. This may result in negative correlation18between two links which are in series, as well as negative correlation for links in parallel. Although19recent contributions from network science emphasize the necessity for procuring the exact interde-20pendency between road segments, little is known about existing negative and positive correlations21in a complex traffic network.22

We hypothesize that traffic links in a real network exhibit both negative and positive corre-23lations after detrending. This empirical study sheds light on the correlation of traffic links under24different traffic regimes by adopting an in-depth statistical analysis to pinpoint the correlation of25traffic. We contribute to the literature by defining and measuring the correlation of traffic on a26real-word network, and explore their causes. This correlation structure is capable to augment the27accuracy of short-term traffic forecasting models. We restrict our attention to a sub-network of the28Minneapolis - St. Paul freeway system composed of 140 loop detectors with a grid-like network29topology. This topology enables us to juxtapose the negative correlation of competitive segments30with the positive correlation of complementary segments.31

The remainder of the paper is set out as follows. First, we review the literature discussing32the correlation nature of traffic links in road networks. Next, we discuss the data and methodology33used in this study in detail. We proceed to graphically display the empirical correlation of traffic34links, and collate the results in individual traffic regimes. We then conclude the paper by broaching35a number of recommendations for future research.36

PREVIOUS STUDIES37Traffic analysts scrutinize the correlation of traffic links to augment the accuracy of short-term traf-38fic forecasting, reliable path finding, and missing data estimation. The literature discussing these39three branches of research is prolific, and a well-established body of literature reviews the method-40ologies used in these studies. In 2004, Vlahogianni et al. (4) reviewed objectives and methods41used in short-term traffic forecasting. They examined the pros and cons of modeling frameworks42under the umbrella of parametric and non-parametric techniques. In 2014, Vlahogianni et al. (5)43examined the challenges of modeling in short-term traffic forecasting, and concluded there is an44

Page 4: 1 Using Temporal Detrending to Observe the Spatial

Ermagun, Chatterjee, and Levinson 3

uncertainty whether the accuracy of developed complex methods are better than researchers mod-1els developed 30 years ago. More recently, Ermagun and Levinson (6) systematically reviewed2more than 130 papers using spatiotemporal models for traffic forecasting. They emphasized that a3large gulf exists between the spatial dependence of traffic links on a real network and the networks4studied in current literature, and drew attention to these three shortcomings: (1) looking at spatial5dependency of either adjacent or distant upstream and downstream of study link, (2) prejudging the6spatial dependence between traffic links in modeling, and (3) neglecting the negative correlation7between traffic links in modeling.8

One of the main difficulties in the literature is that it is plagued with multifarious complex9forecasting methods, while representing a long but shallow comprehension of spatial dependency10between traffic links. In this part, hence, we dig into the correlation analysis used in the literature,11and emphasize the approach of capturing spatial dependence between traffic links.12

Early researchers used information upstream and downstream of the study link, as there is13a reasonable belief that they are highly and positively correlated with the study link. Okutani and14Stephanedes (7) were the first to utilize the information of adjacent upstream links in predicting15traffic flow in 1984, although they never pointed out the correlation between traffic links. This16approach spread through the literature for two major reasons. First, it was simple. As alluded to17previously, traffic network is a complex system and understanding the detailed interrelationship18between all traffic links requires comprehensive knowledge and large computational efforts. Thus,19considering only the immediate upstream and downstream of the study link eases the calculation.20Second, it was effective. Research typically studied a corridor comprising a small number of21traffic links, which narrows the neighboring links of the study link down to adjacent upstream and22downstream links. Then, it is not surprising to achieve decent results.23

Stathopoulos and Dimitriou (8) used spatial correlation between two loop detectors. Em-24bedding the information of the immediate upstream link, they improved traffic forecasts. Although25Chandra and Al-Deek (9) examined a significant correlation of the study link with both adjacent26and far traffic links, they only utilized the information of immediate upstream and downstream27links.28

Despite the simplicity and effectiveness, this method ignores the effects of other traffic29links, as correlation only between adjacent links was presented in the literature. Li et al. (3), for30instance, narrowed their study area to three consecutive traffic links: “It had been shown that the31correlation degrees among different points decreases significantly with respect to distances. So,32[...] we only consider m = 3 in this paper. That is, only the upstream and downstream neighboring33detecting points are studied.” This approach is incomplete, as it selects only a part of the network34and neglects the correlation between other traffic links.35

More recently studies have emerged to examine the effects of not just adjacent traffic links,36and thereby embed more information to enhance the accuracy of forecasting methods. One class37of studies prejudges the correlation between traffic links in different distance thresholds. This class38is so-called “lth-order neighbors,” where l represents the ring of neighbors. For instance, the first-39order neighbors are those links that adjoin the study link, while the second-order neighbors are40indirectly joined to the study link, having the adjacent links in the middle. Studies falling into this41class assign a similar correlation value to each neighbor. In 2003, for example, Kamarianakis and42Prastacos (10) considered both the first- and second-order neighbors and equally weighted all first-43and second-order neighbors.44

The other class of studies benefits from the correlation coefficient analysis to determine the45

Page 5: 1 Using Temporal Detrending to Observe the Spatial

Ermagun, Chatterjee, and Levinson 4

correlated links with the study link. In 2005, Sun et al. (11) analyzed a grid network comprising131 traffic links in Beijing, China. To capture all spatial and temporal correlation between traffic2links, they adapted Pearson correlation coefficient analysis. The results indicated the traffic flows3of links are positively correlated, and the correlation does not follow any distance pattern.4

Using cross-correlation analysis, Yue and Yeh (12) quantitatively measured the correlation5between seven traffic links in an urban corridor of Kowloon, Hong Kong. They illustrated that6the consecutive traffic links are positively correlated, and this correlation decreases by distance.7They also found a significant drop in the correlation coefficient of one upstream link, which was8justified by the presence of an off-ramp before the upstream link to a large residential area. A recent9study (13) scrutinized the correlation between 3,254 loop detectors installed on the Minneapolis10- St. Paul freeway system. Their analysis underlined that positive correlations exist in hundreds11of sensors distributed on the whole road network sparsely, not just the neighborhood around the12study link. Although they were the first to reveal the sparse correlation between traffic links, they13overlooked the negative correlation nature of traffic links.14

In defiance of various approaches to capture spatial correlation between traffic links, the15literature has come to a longstanding agreement that traffic links are positively correlated.16

We argue, that after properly controlling for temporal demand effects (i.e. by detrending as17described below), network segments are both positively and negatively correlated, as one would ex-18pect from an understanding of spatial network structure which has links in both series and parallel,19and where travelers have choice of route and are sensitive to perceived travel time (14).20

DATA21In 2007, the Minnesota Department of Transportation (MnDOT) developed Intelligent Roadway22Information System (IRIS), an open-source advanced traffic management system to monitor and23manage highway traffic. The system collects and reports traffic flow, speed, occupancy, volume,24density, and headway from 7,246 loop and virtual detectors in 30 seconds increments. Detectors25are located in five distinct places: (1) Mainline Detectors, which collect data from all traffic lanes26of interstates and highways, (2) Entrance ramp detectors, which collect the data of on-ramps, (3)27Exit ramp detectors, which collect the data of off-ramps, (4) Queue ramp detectors at the start of28ramps, and (5) Passage ramp detectors near ramp meters.29

For the purpose of this study, we extracted traffic flow of a major sub-network of the Min-30neapolis - St. Paul freeway system. The sub-network consists of major highways in the western31suburbs, specifically I-494, I-94, I-394, US 169, TH 212, TH 100, and TH 62 for the East-West and32South-North directions. They are the busiest major highways in the Minneapolis - St. Paul high-33way system, particularly TH 62 in south Minneapolis is a notorious hotspot for traffic congestion.34This sample includes 687 detectors, 146 of which are entrance and exit ramps. In road segments,35the number of detectors varies from 1 to 4 depending on the number of lanes. We aggregated the36flow information of all traffic lanes on a road segment, which results in 149 stations. We excluded379 stations and 91 ramps due to lack of data. We collected the traffic flow measurements for all38Tuesdays of 2015 in three distinct times-of-day:39

1. Morning rush hour: From 7:30-8:30 AM40

2. Non-rush hour: From 10:00-11:00 AM41

3. Evening rush hour: From 4:30-5:30 PM42

Page 6: 1 Using Temporal Detrending to Observe the Spatial

Ermagun, Chatterjee, and Levinson 5

We also extracted the same information for all Saturdays of 2015. This trajectory enables1us to compare the variation of competitive and complementary nature of traffic links not only2over congested and uncongested regimes, but also over weekdays and weekends. We smoothed3the traffic flow over 1-minute, which is assumed reasonable for the purpose of this study. This4results in 3,120 observations (52× 60) for each detector for each time-of-day. The missing data5are excluded from the analysis for each detector. We cannot easily graph a 140× 140 matrix for6the analysis purpose, so we select some illustrative examples. Four stations were targeted in a7stratified sampling method. They are stations 719, 340, 933, and 762, which are located in I-494,8I-394, TH 100, and US 169, respectively. The characteristics of traffic flow for these four stations9for all weeks are summarized in Table 1. As shown, the maximum traffic flow belongs to link 34010for Tuesday evening rush hour. The minimum traffic flow was observed on Saturday between 7:3011AM and 8:30 AM in link 719.12

TABLE 1 : Traffic flow characteristics of selected stations over week-of-year

Link Time Average St. Dev. Max Min

719

Tuesday 7:30-8:30 6082.59 849.71 7166.00 3788.00Tuesday 10:00-11:00 3399.45 608.30 4426.00 2138.00Tuesday 16:30-17:30 6227.45 1248.70 8036.00 3273.00Saturday 7:30-8:30 1921.12 880.07 3152.00 1.00Saturday 10:00-11:00 3690.43 1069.95 4826.00 18.00Saturday 16:30-17:30 4223.35 1167.78 5468.00 8.00

340

Tuesday 7:30-8:30 5470.54 411.86 5864.00 3818.00Tuesday 10:00-11:00 2679.08 248.57 3151.00 1942.00Tuesday 16:30-17:30 7768.58 643.83 8634.00 5966.00Saturday 7:30-8:30 1544.63 161.93 1870.00 1266.00Saturday 10:00-11:00 3182.50 202.86 3594.00 2834.00Saturday 16:30-17:30 3826.38 406.47 4673.00 2915.00

933

Tuesday 7:30-8:30 3672.42 299.76 4188.00 2576.00Tuesday 10:00-11:00 2292.58 199.11 2653.00 1714.00Tuesday 16:30-17:30 6687.75 706.48 7425.00 3341.00Saturday 7:30-8:30 1148.41 202.08 1450.00 703.00Saturday 10:00-11:00 2172.04 315.30 2639.00 1379.00Saturday 16:30-17:30 2810.31 406.34 3824.00 1672.00

762

Tuesday 7:30-8:30 4608.48 620.92 5517.00 2281.00Tuesday 10:00-11:00 3314.12 838.73 5839.00 1691.00Tuesday 16:30-17:30 4832.98 764.05 7368.00 3290.00Saturday 7:30-8:30 2017.15 589.23 3527.00 849.00Saturday 10:00-11:00 3602.87 744.23 5058.00 2023.00Saturday 16:30-17:30 3970.42 753.70 5985.00 2157.00

To portray the traffic oscillation during day-of-week and day-of-weekend, we plotted the13traffic flow of the selected links in Figure 2 for February 24th and 28th, 2015. As we expected, the14traffic flow pattern of Tuesday is markedly different from Saturday. On Tuesday, traffic flow has15two major peaks. One is happened in morning between 7:30 and 8:30, and the other is observed for16

Page 7: 1 Using Temporal Detrending to Observe the Spatial

Ermagun, Chatterjee, and Levinson 6

a longer period of time in evening between 15:00 and 18:00. Comparing the traffic flow of morning1rush hour with evening rush hour, we observe evening rush hour is generally more congested than2the mornings due to more personal trips. On Saturday, we witness on major traffic peak, which3starts about 10:00 AM. However, the traffic volume on Saturday is entirely lower than Tuesday.4

METHODOLOGICAL FRAMEWORK5Three-dimensional Data Detrending6Traffic flow exhibits time trends in time-of-day, day-of-week, and week-of-year. These trends are7witnessed not only at the link level, but also at the system level, which is the total system travel by8time-of-day. Eliminating these time variations is fundamental to capture more accurate and reliable9spatial correlation between traffic links. As discussed in the preceding section, we extracted traffic10flow from three different one-hour time threshold in both Tuesdays and Saturdays of 2015. For11the purpose of temporal detrending and observing spatial correlation, we detrend the data in three12dimensions:13

1. Hourly Dimension: In this step, we remove the trend in each one-hour time threshold14from each traffic link. For example, we eliminate the trend from time threshold of 7:30-158:30 AM of the first Tuesday of 2015 from link 719. We repeat this step for all Tuesdays16and for all links.17

2. Weekly Dimension: The hourly detrended data of each traffic link has a weekly trend of1852 weeks of the year. In this step, we eliminate this trend from the data.19

3. System Dimension: Although removing the trend in two aforementioned directions is20prevalent in the traffic literature, this dimension is typically overlooked in the traffic21data analysis. Unlike the previous two dimensions that focus on removing time trend,22this dimension emphasizes on extracting the total system travel by time-of-day. Traffic23flow of each link in a specific time span during a day displays a remarkable correlation24with total flow of all traffic links. Deriving this trend is fundamental to observe the25competitive nature of traffic links.26

In the remainder of this section, we unpack the statistical steps behind the three-dimensional27data detrending. We utilize an algorithm to remove time-of-day and day-of-week trend for each28link, and the total system travel by time-of-day trend.29

Data Detrending Algorithm30Loop detectors may not be functional for different periods of time during any given day owing31to malfunction or other technical issues, and in some cases they may not be functional for longer32stretches of time due to construction work or other longer term issues. There are various ways in33which such lack of data from a detector are indicated.34

After reading the data and parsing it correctly to account for malfunctions, we concentrate35on the specific day of the week and duration of time that is of interest for our analysis. To verify36algorithmic robustness, we tested the algorithm for all days of the week, at various start and end37time points, and different levels of data aggregation. This yields a vector of volume of traffic data38for each traffic link and each day. Consider m the total number of aggregated data points, then the39vector of observations is represented by Y(s, t) = (Ys,t,1, . . . ,Ys,t,m) for each station s and each day40

Page 8: 1 Using Temporal Detrending to Observe the Spatial

Ermagun, Chatterjee, and Levinson 7

0

20

40

60

80

100

0:00 2:24 4:48 7:12 9:36 12:00 14:24 16:48 19:12 21:36 0:00

Flo

w R

ate

(1 m

in)

Time of Day

Traffic Flow of Link 719 Tuesday Saturday

0

20

40

60

80

100

120

0:00 2:24 4:48 7:12 9:36 12:00 14:24 16:48 19:12 21:36 0:00

Flo

w R

ate

(1 m

in)

Time of Day

Traffic Flow of Link 340 Tuesday Saturday

0

10

20

30

40

50

60

70

80

0:00 2:24 4:48 7:12 9:36 12:00 14:24 16:48 19:12 21:36 0:00

Flo

w R

ate

(1 m

in)

Time of Day

Traffic Flow of Link 762 Tuesday Saturday

0

20

40

60

80

100

120

0:00 2:24 4:48 7:12 9:36 12:00 14:24 16:48 19:12 21:36 0:00

Flo

w R

ate

(1 m

in)

Time of Day

Traffic Flow of Link 933 Tuesday Saturday

FIGURE 1 : Traffic Flow of Selected Sections for February 24th and 28th, 2015

Page 9: 1 Using Temporal Detrending to Observe the Spatial

Ermagun, Chatterjee, and Levinson 8

t. The notations s ∈ {s1, . . . ,sS} and t = 1,2, . . . ,T stand for study stations and days, respectively.1In our present data analysis T = 52. We fit a robust location estimator to the data vectors Y(s, t)2for each station s ∈ {s1, . . . ,sS}. This is captured by obtaining the minimizer µ(s) ∈ Rm as per3Equation 1.4

T

∑t=1||Y(s, t)−µ(s)||1, (1)

where || · ||1 is the L1-norm of a vector. This yields the vector of medians for each location.5This step removes the secular trend for each coordinate of the vector obtained from the6

previous step. To this detrended data, we fit an autoregression model of appropriate order, to model7the temporal dependencies between successive time aggregation intervals. This step involves a8model selection, and we select the best available autoregression up to and including lags of order90 to 5. A lag zero model implies no temporal dependency.10

In order to do this, we fit under penalization the following model using the assumption that11for each s and t, the sequence {ε(s, t,k)} is a mean zero, finite variance white noise sequence.12

Y (s, t,k)− µ(s,k) =J

∑j=0

φs, j(Y (s, t,k− j)− µ(s,k− j)

)+ ε(s, t,k), (2)

Figure ?? represents the selected models for Link 719 in different time thresholds.13We assume second order stationarity for the above model fitting. We obtain the residuals14

after this autoregression model fitting. As a result, we derive Equation 315

R(s, t,k) = Y (s, t,k)− µ(s,k)−J

∑j=1

φs, j(Y (s, t,k− j)− µ(s,k− j)

). (3)

Following the aforementioned steps to remove the trend and temporal dependencies, we16embark on steps to obtain the spatial dependency patterns using the R(s, t,k) values. The first step17is to elicit the neighborhood dependency relations. For this, we obtain serial correlations across18each pair of station s1 and s2 for each time t. This results in:19

C(s1,s2, t) =Cor(R(s1, t,k),R(s2, t,k)

). (4)

We construct a robust yearly summary of these by taking the median C1(s1,s2) of20{C(s1,s2,1), . . . ,C(s1,s2,T )}. If C1(s1,s2) is above a threshold c1, we consider the stations s1 and21s2 to be spatially correlated. We adopt c1 = 0.10 for the present study.22

After obtaining and identifying correlation structures in the above manner, we study longer23range of complementary relations between stations. To achieve this, we first compute the propor-24tion of trend and temporal dependency adjusted residuals for each day t and each station s, which25represents the proportion of traffic flowing through station s on day t at each time aggregation step.26Let these proportional residuals be R(s, t,k). We use the same measure of association, namely the27correlation, using these. That is, across each pair of station s1 and s2 for each time t, we obtain:28

C(s1,s2, t) =Cor(R(s1, t,k), R(s2, t,k)

). (5)

Page 10: 1 Using Temporal Detrending to Observe the Spatial

Ermagun, Chatterjee, and Levinson 9

Saturday 10:00-11:00 Saturday 16:30-17:30

Saturday 7:30-8:30 Tuesday 16:30-17:30

Tuesday 10:00-11:00 Tuesday 7:30-8:30

FIGURE 2 : Fitted autoregressive model to Link 719 in different time thresholds

Page 11: 1 Using Temporal Detrending to Observe the Spatial

Ermagun, Chatterjee, and Levinson 10

As in the previous step, we construct a robust yearly summary of these by taking the me-1dian C2(s1,s2) of {C(s1,s2,1), . . . ,C(s1,s2,T )}, and obtain a negative or positive relation between2stations s1 and s2 if C2(s1,s2) < −c2 for a chosen threshold c2. In the present study, we used3c2 = 0.10.4

We have cross checked our computations with other choices of thresholds and other tuning5parameters of our algorithm, and the overall pattern of the results we obtain are quite stable. In the6following section, we depict the extracted correlation for selected links and discuss the correlation7of traffic.8

GRAPHICAL DISCUSSION9After temporal detrending the data, we represent and discuss the results of spatial correlation of10selected traffic links in this section. To give the reader a sense of how the value of correlation11fluctuates between traffic links and time-of-day, we plotted the box and whisker diagram of four12traffic links in Figure 3. Looking at the plots, the dots above and below each box show only a13few number of links are highly correlated with the study link. The positive correlation is stronger14than negative correlation, although they are competitive in the number. In general, the negative15correlation is more prevalence on Tuesday morning and evening rush hours than other time-of-day.16It is justified by the congestion during the rush hour, which brings to light the competitive role of17parallel traffic links in the network. A weak negative correlation is observed during non-rush hour18and weekends, due to the low level of traffic congestion.19

To examine the relationship of negative and positive correlations with the network structure,20and more precisely the parallel and series links, we mapped the correlation results of the selected21links for Tuesday morning rush hour in Figure 4. In this figure, the color spectrum of negative22correlation changes from light pink to violet, and for positive correlation it varies from light blue23to dark blue. The study link is shown by a black star. The correlation magnitude greater than |10.0|24represents a strong significant correlation at the 90% confidence interval.25

Page 12: 1 Using Temporal Detrending to Observe the Spatial

Ermagun, Chatterjee, and Levinson 11

-25

2575

-50

050

100

Cor

rela

tion

(%)

Correlation Analysis of Link 719

Tuesday 7:30-8:30 Tuesday 10:00-11:00Tuesday 16:30-17:30 Saturday 7:30-8:30Saturday 10:00-11:00

-10

1030

50-2

00

2040

60C

orre

latio

n (%

)

Correlation Analysis of Link 340

Tuesday 7:30-8:30 Tuesday 10:00-11:00Tuesday 16:30-17:30 Saturday 7:30-8:30Saturday 10:00-11:00

-10

1030

5070

-20

020

4060

80C

orre

latio

n (%

)

Correlation Analysis of Link 933

Tuesday 7:30-8:30 Tuesday 10:00-11:00Tuesday 16:30-17:30 Saturday 7:30-8:30Saturday 10:00-11:00

-25

2575

-50

050

100

Cor

rela

tion

(%)

Correlation Analysis of Link 762

Tuesday 7:30-8:30 Tuesday 10:00-11:00Tuesday 16:30-17:30 Saturday 7:30-8:30Saturday 10:00-11:00

FIGURE 3 : Statistical Correlation Analysis for Selected Sections

The correlation results of station 719 highlight that, after detrending, there is a network1structure effect. Both negative and positive correlation exist between flows at this station and2others. This correlation ranges from -48.3 to 70.0 for station 719. A strong positive correlation3belongs to the immediate upstream and downstream links of station 719. It is in line with our4hypothesis and previous studies. The strength of positive correlation declines with distance. The5positive correlation stretch upstream turns negative at station 515, which is located before an off-6ramp. We posit traffic congestion propagation on station 719 results in some upstream traffic7switching to a substitute path, and thereby more traffic on station 719 reduces traffic upstream as8travelers seek substitutes. A strong stretch of negative correlation is also observed in the links9parallel to station 719. This supports our hypothesis about competitive links. US 169 and TH 10010are two main competitive paths for I-494. Thereupon, it is not surprising that traffic flow passes11through the substitute paths, when traffic congestion has a strong effect on the network.12

Likewise, there is a strong positive correlation between station 340 and its immediate up-13stream and downstream. This correlation is weakened by distance station station 340 and is trans-14formed into the negative correlation upstream. There is a strong negative correlation between15station 340 and its competitive links in TH 62 and I-494. Looking at the correlation analysis of16station 933, we observe a strong positive correlation between station 933 and its two immediate17upstream links, but not its downstream link. As shown, the downstream station 935 stands in a18

Page 13: 1 Using Temporal Detrending to Observe the Spatial

Ermagun, Chatterjee, and Levinson 12

^

I-494

T.H.212

U.S.1

69

I-394

T.H.62

I-94

T.H.10

0I-494

S143S236S128S234S129S232S230S226S224

S220S218S213

S208

S318S319S320S336S337S338S339S340S341S342S345S346S347

S350S351S352S353

S354S355S356S357S358S366S368S369

S310S308

S307S306

S425S424

S756

S758S221S759S760S761

S762S763S764

S755S765

S766

S767S768

S769S770

S442

S441S440S439

S438S437

S435S434S433

S432

S431

S430

S429

S428S427

S935

S933S932S931

S930S406S395

S394S393

S392S391

S389S388

S387

S384S383S382

S381S380

S379

S378S377S376

S375S188S480

S191S482 S481S483

S485S486

S487

S488

S511

S512

S513

S514

S515

S516S517S518

S718S719S720S721

S723

S724

S726

S728

S729

S731S214

S231

S210

S1769

S1767

S1766

S1614S1613

S1011S1009

Correlation Analysis of Link 719

Correlation (%)-48.3 - -30.0-29.9 - -20.0-19.9 - -10.0-9.9 - 0.000.1 - 10.010.1 - 20.020.1 - 30.030.1 - 70.0

^ Study Link

±

^

I-494

T.H.212

U.S.1

69

I-394

T.H.62

I-94

T.H.10

0I-494

S143S236S128S234S129S232S230S226S224

S220S218S213

S208

S318S319S320S336S337S338S339S340S341S342S345S346S347

S350S351S352S353

S354S355S356S357S358S366S368S369

S310S308

S307S306

S425S424

S756

S758S221S759S760S761

S762S763S764

S755S765

S766

S767S768

S769S770

S442

S441S440S439

S438S437

S435S434S433

S432

S431

S430

S429

S428S427

S935

S933S932S931

S930S406S395

S394S393

S392S391

S389S388

S387

S384S383S382

S381S380

S379

S378S377S376

S375S188S480

S191S482 S481S483

S485S486

S487

S488

S511

S512

S513

S514

S515

S516S517S518

S718S719

S720S721

S723

S724

S726

S728

S729

S731S214

S231

S210

S1769

S1767

S1766

S1614S1613

S1011S1009

Correlation Analysis of Link 340

Correlation (%)-23.9 - -20.0-19.9 - -10.0-9.9 - 0.000.1 - 10.010.1 - 20.020.1 - 30.030.1 - 70.0

^ Study Link

±

^

I-494

T.H.212

U.S.1

69

I-394

T.H.62

I-94

T.H.10

0I-494

S143S236S128S234S129S232S230S226S224

S220S218S213

S208

S318S319S320S336S337S338S339S340S341S342S345S346S347

S350S351S352S353

S354S355S356S357S358S366S368S369

S310S308

S307S306

S425S424

S756

S758S221S759S760S761

S762S763S764

S755S765

S766

S767S768

S769S770

S442

S441S440S439

S438S437

S435S434S433

S432

S431

S430

S429

S428S427

S935

S933S932S931

S930S406S395

S394S393

S392S391

S389S388

S387

S384S383S382

S381S380

S379

S378S377S376

S375S188S480

S191S482 S481S483

S485S486

S487

S488

S511

S512

S513

S514

S515

S516S517S518

S718S719

S720S721

S723

S724

S726

S728

S729

S731S214

S231

S210

S1769

S1767

S1766

S1614S1613

S1011S1009

Correlation Analysis of Link 933

Correlation (%)-19.7 - -10.0-9.9 - 0.000.1 - 10.010.1 - 20.020.1 - 30.030.1 - 70.0

^ Study Link

±

^

I-494

T.H.212

U.S.1

69

I-394

T.H.62

I-94

T.H.10

0I-494

S143S236S128S234S129S232S230S226S224

S220S218S213

S208

S318S319S320S336S337S338S339S340S341S342S345S346S347

S350S351S352S353

S354S355S356S357S358S366S368S369

S310S308

S307S306

S425S424

S756

S758S221S759S760S761S762

S763S764

S755S765

S766

S767S768

S769S770

S442

S441S440S439

S438S437

S435S434S433

S432

S431

S430

S429

S428S427

S935

S933S932S931

S930S406S395

S394S393

S392S391

S389S388

S387

S384S383S382

S381S380

S379

S378S377S376

S375S188S480

S191S482 S481S483

S485S486

S487

S488

S511

S512

S513

S514

S515

S516S517S518

S718S719

S720S721

S723

S724

S726

S728

S729

S731S214

S231

S210

S1769

S1767

S1766

S1614S1613

S1011S1009

Correlation Analysis of Link 762

Correlation (%)-44.3 - -30.0-29.9 - -20.0-19.9 - -10.0-9.9 - 0.000.1 - 10.010.1 - 20.020.1 - 30.030.1 - 70.0

^ Study Link

±FIGURE 4 : Correlation of Four Selected Sections for Tuesday between 7:30 AM and 8:30 AM

Page 14: 1 Using Temporal Detrending to Observe the Spatial

Ermagun, Chatterjee, and Levinson 13

significant distance from station 933, which results in a weak positive correlation. Stations 755,1756, and 724 that are strong substitutes with station 933 exhibits a strong negative correlation.2

Noteworthy is that spurious correlation appears in correlation analysis of all links. Al-3though it includes fewer than 10% of the correlation results, it should be kept in mind that it stems4from the nature of using real-world data and a significant number of missing data in loop detector5data samples. For example, we do not have any physical justification to support why there is a6strong negative correlation between stations 762 and 1769 or stations 340 and 935. Instead we7believe it is a spurious correlation.8

Traffic flow varies between weekdays and weekends. This variation results in a different9correlation structure between traffic links. For example, we do not expect a strong negative corre-10lation between traffic links during non-rush hour, as there is little congestion causing traffic flow11to switch to the competitive paths. However, we still expect a strong positive correlation between12the study link and its immediate upstream and downstream links. We also expect the evening rush13hour and morning rush hour are alike in the correlation structure. To test these hypotheses, we14present the correlation analysis of station 719 for different times of day in Tuesday and Saturday15in Figure 5.16

First cut analysis shows a significant difference between rush hour and non-rush hour, and17between weekdays and weekends. In Tuesday non-rush hour, we observe a positive correlation18between upstream and downstream of the study link. Not only does a strong correlation exist19between the immediate links, but also in a second-order upstream link. Traffic flow passes through20links faster in the uncongested traffic condition than congested traffic condition. As a consequence,21the traffic observed in the upstream links are observed in the study link in a shorter time slice, and22thereby they show a stronger positive correlation. A strong point of emphasis is the strength of23this correlation in comparison with morning rush hour. The correlation between upstream and24downstream in non-rush hour is stronger than rush hour, as fewer travelers divert to alternative25routes. As we expected, there is no significant negative correlation in non-rush hour. Comparing26the evening rush hour with morning rush hour, we detect a similar correlation not only in pattern,27but in the magnitude as well. The results indicate dissimilarities between correlation patterns for28Saturday and Tuesday between 7:30 AM and 8:30 AM. The correlation pattern of station 719 for29Saturday between 7:30 AM and 8:30 AM is fairly similar to Tuesday between 10:00 AM and 11:0030AM. It is not surprising as there is no congestion on Saturday early morning, and thereby there is31no negative correlation effect. Interestingly, the negative correlations show up between 10:00 AM32and 11:00 AM on Saturday.33

CLOSING REMARKS34Okutani and Stephanedes (7) directed attention to spatial correlation of traffic links. They did35not recommend incorporating the information of correlated links in traffic forecasting models, but36rather the immediate upstream link. This school of thought has spread through the literature of37short-term traffic forecasting. Using the spatial correlation between links has grown in popularity,38not just because it is a way to augment short-term traffic forecasting models, but also because it is a39way to cope with missing data and path selection. However, the literature provides little empirical40evidence for the correlation of traffic in a real-word network, and is limited to correlation analysis41of links in a series corridor encompassing consecutive links. The literature is comprehensive in42the sense that it deals with positive correlation among the study links and its immediate upstream43and downstream links. However, it is not generic in that it sets broad principles for complementary44

Page 15: 1 Using Temporal Detrending to Observe the Spatial

Ermagun, Chatterjee, and Levinson 14

^

I-494

T.H.212

U.S.1

69

I-394

T.H.62

I-94

T.H.10

0I-494

S143S236S128S234S129S232S230S226S224

S220S218S213

S208

S318S319S320S336S337S338S339S340S341S342S345S346S347

S350S351S352S353

S354S355S356S357S358S366S368S369

S310S308

S307S306

S425S424

S756

S758S221S759S760S761

S762S763S764

S755S765

S766

S767S768

S769S770

S442

S441S440S439

S438S437

S435S434S433

S432

S431

S430

S429

S428S427

S935

S933S932S931

S930S406S395

S394S393

S392S391

S389S388

S387

S384S383S382

S381S380

S379

S378S377S376

S375S188S480

S191S482 S481S483

S485S486

S487

S488

S511

S512

S513

S514

S515

S516S517S518

S718S719S720S721

S723

S724

S726

S728

S729

S731S214

S231

S210

S1769

S1767

S1766

S1614S1613

S1011S1009

Correlation Analysis of Link 719 for Tuesday Non Rush Hour

Correlation (%)-13.7 - -10.0-9.9 - 0.000.1 - 10.010.1 - 20.020.1 - 30.030.1 - 75.0

^ Study Link

±

^

I-494

T.H.212

U.S.1

69

I-394

T.H.62

I-94

T.H.10

0I-494

S143S236S128S234S129S232S230S226S224

S220S218S213

S208

S318S319S320S336S337S338S339S340S341S342S345S346S347

S350S351S352S353

S354S355S356S357S358S366S368S369

S310S308

S307S306

S425S424

S756

S758S221S759S760S761

S762S763S764

S755S765

S766

S767S768

S769S770

S442

S441S440S439

S438S437

S435S434S433

S432

S431

S430

S429

S428S427

S935

S933S932S931

S930S406S395

S394S393

S392S391

S389S388

S387

S384S383S382

S381S380

S379

S378S377S376

S375S188S480

S191S482 S481S483

S485S486

S487

S488

S511

S512

S513

S514

S515

S516S517S518

S718S719S720S721

S723

S724

S726

S728

S729

S731S214

S231

S210

S1769

S1767

S1766

S1614S1613

S1011S1009

Correlation Analysis of Link 719 for Tuesday Evening Rush Hour

Correlation (%)-37.9 - -30.0-29.9 - -20.0-19.9 - -10.0-9.9 - 0.000.1 - 10.010.1 - 20.020.1 - 30.030.1 - 60.0

^ Study Link

±

^

I-494

T.H.212

U.S.1

69

I-394

T.H.62

I-94

T.H.10

0I-494

S143S236S128S234S129S232S230S226S224

S220S218S213

S208

S318S319S320S336S337S338S339S340S341S342S345S346S347

S350S351S352S353

S354S355S356S357S358S366S368S369

S310S308

S307S306

S425S424

S756

S758S221S759S760S761

S762S763S764

S755S765

S766

S767S768

S769S770

S442

S441S440S439

S438S437

S435S434S433

S432

S431

S430

S429

S428S427

S935

S933S932S931

S930S406S395

S394S393

S392S391

S389S388

S387

S384S383S382

S381S380

S379

S378S377S376

S375S188S480

S191S482 S481S483

S485S486

S487

S488

S511

S512

S513

S514

S515

S516S517S518

S718S719S720S721

S723

S724

S726

S728

S729

S731S214

S231

S210

S1769

S1767

S1766

S1614S1613

S1011S1009

Correlation Analysis of Link 719 for Saturday Early Morning

Correlation (%)-14.7 - -10.0-9.9 - 0.000.1 - 10.010.1 - 20.020.1 - 30.030.1 - 75.0

^ Study Link

±

^

I-494

T.H.212

U.S.1

69

I-394

T.H.62

I-94

T.H.10

0I-494

S143S236S128S234S129S232S230S226S224

S220S218S213

S208

S318S319S320S336S337S338S339S340S341S342S345S346S347

S350S351S352S353

S354S355S356S357S358S366S368S369

S310S308

S307S306

S425S424

S756

S758S221S759S760S761

S762S763S764

S755S765

S766

S767S768

S769S770

S442

S441S440S439

S438S437

S435S434S433

S432

S431

S430

S429

S428S427

S935

S933S932S931

S930S406S395

S394S393

S392S391

S389S388

S387

S384S383S382

S381S380

S379

S378S377S376

S375S188S480

S191S482 S481S483

S485S486

S487

S488

S511

S512

S513

S514

S515

S516S517S518

S718S719S720S721

S723

S724

S726

S728

S729

S731S214

S231

S210

S1769

S1767

S1766

S1614S1613

S1011S1009

Correlation Analysis of Link 719 for Saturday Before Noon

Correlation (%)-24.5 - -20.0-19.9 - -10.0-9.9 - 0.000.1 - 10.010.1 - 20.020.1 - 30.030.1 - 70.0

^ Study Link

±FIGURE 5 : Comparison of Correlation for Different Times and Days

Page 16: 1 Using Temporal Detrending to Observe the Spatial

Ermagun, Chatterjee, and Levinson 15

nature of traffic links, and leaves the correlation analysis of competitive traffic links for later.1This empirical study instead applies a three-dimensional data detrending algorithm and2

tests it on a grid-like network topology consisting of both competitive and complementary traffic3links. This methodological approach enabled us to shed more light on the understanding of the4traffic phenomena. We added to the body of knowledge on short-term traffic forecasting problem5by capturing the realistic spatial correlation between traffic links. The key findings from correlation6analysis of 140 traffic links and 54 ramps in the Minneapolis - St. Paul network are as follows:7

• In a network comprising links in parallel and series, both negative and positive correlation8shows up between links.9

• The strength of correlation varies by time-of-day and day-of-week.10

• The strong negative correlation is observed in rush hours, when congestion affects travel11behavior. This correlation occurs mostly in parallel links, and in far upstream links where12travelers receive information about congestion (for instance from media, variable mes-13sage signs, or personal observation of propagating shockwaves) and are able to switch to14substitute paths.15

• Irrespective of time-of-day and day-of-week, a strong positive correlation is observed16between upstream and downstream sections. This correlation is stronger in uncongested17regimes, as traffic flow passes through the consecutive links in a shorter time and there is18no congestion effect to shift or stall traffic.19

The sub-network used in this study includes a significant number of missing data pertaining20to both traffic links and time-of-day. To extract more accurate correlation between traffic links, we21need data that represents all traffic demands in the network for a specific time slice. We argue22that accuracy, robustness, and adaptivity are fundamental for successful implementation of short-23term traffic prediction models in advanced traveler information service. The proposed algorithm24is practical for deployment in any traffic network to achieve persistent and accurate correlation25between traffic links. Spelling out the details of how to integrate these correlation effects into26short-term traffic forecasting models remains a research challenge.27

REFERENCES28[1] Eleni I Vlahogianni, Matthew G Karlaftis, and John C Golias. Spatio-temporal short-term29

urban traffic volume forecasting using genetically optimized modular networks. Computer-30Aided Civil and Infrastructure Engineering, 22(5):317–325, 2007.31

[2] Ali Zockaie, Yu Nie, Xing Wu, and Hani Mahmassani. Impacts of correlations on reliable32shortest path finding: A simulation-based study. Transportation Research Record: Journal33of the Transportation Research Board, (2334):1–9, 2013.34

[3] Li Li, Yuebiao Li, and Zhiheng Li. Efficient missing data imputing for traffic flow by consid-35ering temporal and spatial dependence. Transportation research part C: emerging technolo-36gies, 34:108–120, 2013.37

[4] Eleni I Vlahogianni, John C Golias, and Matthew G Karlaftis. Short-term traffic forecasting:38Overview of objectives and methods. Transport reviews, 24(5):533–557, 2004.39

Page 17: 1 Using Temporal Detrending to Observe the Spatial

Ermagun, Chatterjee, and Levinson 16

[5] Eleni I Vlahogianni, Matthew G Karlaftis, and John C Golias. Short-term traffic forecasting:1Where we are and where we’re going. Transportation Research Part C: Emerging Technolo-2gies, 43:3–19, 2014.3

[6] Alireza Ermagun and David Levinson. Spatiotemporal traffic forecasting: Review and pro-4posed directions. 2016.5

[7] Iwao Okutani and Yorgos J Stephanedes. Dynamic prediction of traffic volume through6kalman filtering theory. Transportation Research Part B: Methodological, 18(1):1–11, 1984.7

[8] Antony Stathopoulos, Loukas Dimitriou, and Theodore Tsekeris. Fuzzy modeling approach8for combined forecasting of urban traffic flow. Computer-Aided Civil and Infrastructure9Engineering, 23(7):521–535, 2008.10

[9] Srinivasa Chandra and Haitham Al-Deek. Cross-correlation analysis and multivariate predic-11tion of spatial time series of freeway traffic speeds. Transportation Research Record: Journal12of the Transportation Research Board, (2061):64–76, 2008.13

[10] Yiannis Kamarianakis and Poulicos Prastacos. Forecasting traffic flow conditions in an urban14network: comparison of multivariate and univariate approaches. Transportation Research15Record: Journal of the Transportation Research Board, (1857):74–84, 2003.16

[11] Shiliang Sun, Changshui Zhang, and Yi Zhang. Traffic flow forecasting using a spatio-17temporal bayesian network predictor. In International Conference on Artificial Neural Net-18works, pages 273–278. Springer, 2005.19

[12] Yang Yue and Anthony Gar-On Yeh. Spatiotemporal traffic-flow dependency and short-term20traffic forecasting. Environment and Planning B: Planning and Design, 35(5):762–771, 2008.21

[13] Su Yang, Shixiong Shi, Xiaobing Hu, and Minjie Wang. Spatiotemporal context awareness22for urban traffic modeling and prediction: sparse representation based variable selection. PloS23one, 10(10):e0141223, 2015.24

[14] Henry X Liu, Will Recker, and Anthony Chen. Uncovering the contribution of travel time25reliability to dynamic route choice using real-time loop data. Transportation Research Part26A: Policy and Practice, 38(6):435–453, 2004.27