Subway and Road Congestion (econ.msu.edu/seminars/docs/Congestion_Oct2018.pdf)

Subway and Road Congestion∗

Yizhen Gu, Junfu Zhang and Ben Zou†

October 28, 2018

Abstract

This paper estimates the effect of urban rapid transit rail systems (henceforth subways) on road congestion. Existing empirical studies have been constrained by small numbers of treated cases and poor measures of road speed. We study 45 subway line openings across 25 Chinese cities between August 2016 and December 2017. We use novel crowd-sourced data from a leading digital map provider, which generates high-frequency traffic speed data at the road segment level from the real-time locations of personal mobile devices. We adopt a difference-in-differences approach that compares changes in speed on nearby road segments after a subway line opens to contemporaneous changes in road speed in control cities. We find that subways increase rush-hour speed on nearby roads by about 5%. The magnitude of the effect exhibits a hump-shaped pattern: it increases in the first few weeks after the line opening and then declines and stabilizes at around 5%. The increase in speed takes place immediately after the line opening and remains stable in magnitude after 12 months, the maximum length of our sample period. The effect is concentrated in initially congested segments and declines quickly over distance. Evidence on road speed is corroborated by evidence on substitution patterns between modes of transportation. City-level data on public transit ridership show that there is a clear substitution between subway and bus. Household-level data from transportation surveys in Beijing show that households in neighborhoods with improved access to subway increase subway trips and reduce car and bus trips. Simple back-of-the-envelope calculations suggest that benefits from saved time for those who commute by car are small compared with the construction and operation costs of subways.

Keywords: Subway, congestion, public transit
JEL Classification: R41, R42, L92

∗ We thank Victor Couture, Mark Jacobsen, Cliff Winston, and participants at Brookings-Tsinghua Center, Fudan University, and Jinan University for helpful comments. We have benefited from outstanding research assistance by Hui Hu, Wenwei Peng and Kai Wu. All views expressed and errors are our own.

† Gu: Institute for Economic and Social Research, Jinan University. Email: [email protected]. Zhang: Department of Economics, Clark University. Email: [email protected]. Zou: Department of Economics, Michigan State University. Email: [email protected]

1 Introduction

Traffic congestion is an important challenge facing many cities around the globe. In the United States, it is estimated that an average commuter loses 42 hours per year in traffic congestion.¹ With an average hourly wage of 20 dollars, this amounts to 162 billion dollars wasted in snarled traffic every year. In major cities in the developing world, with rapid increases in population and car ownership, the challenge from worsening traffic congestion is even more acute.

Cities have implemented various policies aimed at reducing traffic congestion. Demand-side policies include restrictions on auto ownership and use (e.g., Li, in press; Davis, 2008; Gu et al., 2017). On the supply side, urban rapid transit rail systems (henceforth subways) are considered an effective way to reduce congestion because they have large capacity and usually do not compete with roads for land. By 2014 there were 171 cities worldwide that had a subway system in operation (Gendron-Carrier et al., 2018). Cities in developing countries account for a large share of recent expansions in subways (Gonzalez-Navarro & Turner, 2018). In China alone, total subway length increased from less than 200km in 2000 to over 5,000km in 2018.²

Despite their potential benefits, subways are very expensive to build and operate. Yang et al. (2018) estimate that every kilometer of Beijing’s subway costs 92 million US dollars. The number is likely to be much higher in developed countries. Subway fares are often highly subsidized and do not cover operating costs (Winston & Maheshri, 2007). Parry & Small (2009) estimate that government subsidies to the metro rail systems account for 50% of the operating cost in Washington, DC and over 80% in Los Angeles. The number in Beijing is also estimated to be over 50%.³ Evidence on the benefits of subways is at best limited. Therefore, there remains a debate on whether constructing and subsidizing subways is justified (Winston & Maheshri, 2007; Baum-Snow & Kahn, 2005; Parry & Small, 2009).

This paper focuses on whether subways reduce road congestion. Empirical studies face two important constraints. First, events of subway line openings are infrequent. Including multiple subway lines in the sample often requires data that cover many cities over many years, which are hard to come by. The second challenge is the lack of good data on traffic speed. Several studies back out speed from travel diaries available in household transportation surveys (e.g., Couture et al., 2018). But these surveys are infrequent and have limited sample sizes. Speed data derived from such surveys may suffer from substantial measurement error. Traffic speed data can also be derived from traffic cameras, wired loop detectors, or GPS-enabled bus and taxi fleets. These data are arguably not representative of the overall traffic pattern and are difficult to compare over time and across cities. These data limitations prompt existing papers to focus on a single city over a limited period of time, usually adopting an event study approach (e.g., Chen & Whalley, 2012; Anderson, 2014; Yang et al., 2018).

¹ https://www.autoinsurancecenter.com/traffic-jammed.htm.

² https://en.wikipedia.org/wiki/Urban_rail_transit_in_China, accessed on June 21, 2018. The 2000 data are based on the authors’ calculation.

³ https://www.scmp.com/news/china/article/1649929/public-anger-beijing-cuts-bus-and-subway-ticket-price-subsidies

This paper uses the recent massive construction of subways across Chinese cities as the setting. It focuses on 45 new subway lines (including 7 extensions to existing lines) in 25 cities that opened during the 18 months between August 2016 and December 2017. The large number of subway line openings in a relatively short period of time from cities within the same country allows us to credibly purge common macro shocks and secular trends. We are able to track about half of the new subway lines for up to 12 months, a significant improvement on existing studies that usually look at a much shorter period of time.

We use a novel source of traffic speed data from China’s leading digital map service provider. The company offers a free digital map application on smartphones and other handheld mobile devices. As long as the location service is turned on for the application,⁴ the mobile device transmits bits of location data every few seconds to the company’s database. The company analyzes these data instantaneously and generates real-time speed information at the road segment level. We obtain hourly speed data for selected road segments between August 1, 2016 and January 31, 2018. Road segments in the sample are those within the 2.5km buffer zones of 5km segments of existing, new, and planned subway lines. They cover all 42 cities in Mainland China with existing or planned subway systems. The sample includes over 3 billion segment-by-hour speed observations from about 1.3 million unique road segments.

To capture changes in speed, log speed is first regressed on a full set of fixed effects where road segment, day of week, and hour of the day are fully interacted. The residual thus captures the deviation from the “usual” traffic speed in the same road segment at the same hour on the same day of week. The residual log speed is then used as the outcome variable in the difference-in-differences estimation. In the baseline, we use road segments that are likely to be directly affected by the new subway lines as the treated sample. Those are road segments that are in the alternative driving routes for trips that could use the new subway lines. Road segments in cities that had existing or planned subways but did not have any new line opening during the sample period serve as controls. The controls help capture seasonality and macro trends common to all cities.
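The residualization step can be illustrated with a minimal sketch. Because the fixed effects are fully interacted, regressing log speed on them is equivalent to subtracting the mean log speed within each segment-by-day-of-week-by-hour cell. The function name `residual_log_speed` and the tuple layout of the records are assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections import defaultdict
from datetime import datetime


def residual_log_speed(records):
    """Demean log speed within (segment, day of week, hour) cells.

    records: iterable of (segment_id, timestamp, speed) tuples, where
    timestamp is a datetime and speed is positive (hypothetical layout).
    Returns the residuals in the same order as the input.
    """
    records = list(records)
    sums = defaultdict(float)
    counts = defaultdict(int)
    keys = []
    for segment_id, ts, speed in records:
        # One cell per fully interacted fixed effect.
        key = (segment_id, ts.weekday(), ts.hour)
        keys.append(key)
        sums[key] += math.log(speed)
        counts[key] += 1
    # Residual = log speed minus the cell mean of log speed.
    return [
        math.log(speed) - sums[key] / counts[key]
        for (_, _, speed), key in zip(records, keys)
    ]


example = [
    ("s1", datetime(2017, 1, 2, 8), 20.0),  # Monday, 8 AM
    ("s1", datetime(2017, 1, 9, 8), 40.0),  # Monday, 8 AM, one week later
    ("s1", datetime(2017, 1, 3, 8), 30.0),  # Tuesday, 8 AM (own cell)
]
residuals = residual_log_speed(example)
```

A residual of 0.05 means the segment is roughly 5% faster than usual for that hour and day of week, which is why residuals can be compared across segments with very different typical speeds.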

Speed in treated and control road segments exhibits parallel trends prior to subway openings. The subway has a positive effect immediately after its opening. In the first week after a line opens, speed in treated road segments increases by about 2.5% relative to that in the control segments. The effect increases to about 6% in the 5th week before it declines and stabilizes at slightly below 5%. By the 12th month after line opening, the effect remains stable and statistically significant. The results are robust to the inclusion of different sets of fixed effects, time trends, longer sample periods, or different cuts into sub-groups. Various alternative econometric models also yield quantitatively similar results.

⁴ With user’s permission, the application can also access real-time location information in the background.


We then include all road segments in cities with new subway lines to study how the effect spreads through the road network. There is substantial heterogeneity by the characteristics of road segments. The effect is concentrated in road segments that are initially congested and those that are parallel to the new subway line. The effect declines quickly over distance from the new subway line: speed in road segments within 1 kilometer from the new subway line increases by 4%; beyond 1 kilometer the effect quickly declines to around 2%. We also find suggestive evidence of a network effect of the subway system: conditional on the distance to the new subway line, road segments closer to the existing subway lines experience larger increases in speed.

To understand the sources of the congestion-relieving effect of subways, we collect annual statistics of urban transportation at the city level and study the substitution patterns among different modes of transportation. There is a clear substitution between subway and bus. An additional 100 subway trips are associated with 22 fewer bus trips and 5 kilometers less bus mileage. Using data from household transportation surveys in Beijing, we find that households in neighborhoods that experienced improved access to subway increased subway trips and reduced car and bus trips. Although these results should be interpreted as correlations rather than causal effects, they lend additional credibility to our results on road speed.

We conduct a simple back-of-the-envelope calculation of the benefit from reduced congestion and compare it with the costs of building and operating subway systems. Using data from Beijing, we show that monetized time value from reduced traffic congestion accounts for only a tiny fraction of the total cost of the city’s subway system. Beijing has the busiest subway system in the country and its roads are among the most congested. Benefits in other cities are likely to be even smaller. Potential large benefits may come from the value of induced trips, as discussed in Parry & Small (2009) and Severen (2018), which our paper is unable to assess.

Although congestion relief is one of the reasons most cited by the proponents of subways, direct empirical evidence is limited. Existing studies suffer from the lack of data, which in turn limits the use of empirical approaches that can credibly identify causal effects. Using a panel of U.S. cities, Winston & Langer (2006) find that cities with longer rail transit mileage are associated with lower congestion costs. Yang et al. (2018) find that the city-level “congestion index” drops sharply following subway openings in Beijing. Yet the event study design cannot purge potential time trends. Other studies found significant reductions in air pollution upon the opening of a subway line (Chen & Whalley, 2012; Gendron-Carrier et al., 2018), which implies reductions in traffic. Because our paper uses many subway line openings, new sources of data at the granular level, and an empirical design that precludes many potential confounding factors, it provides credible evidence of the congestion-reduction effect of subway lines.

More broadly, this paper is related to the literature on the value of public transit. In the United States, there has been a long debate on whether constructing and subsidizing rapid transit systems is worthwhile (Voith, 1991; Baum-Snow & Kahn, 2005; Winston & Maheshri, 2007; Parry & Small, 2009). A key component of the potential benefit is from congestion reduction. Some studies have used interruptions to the public transit system such as strikes to evaluate the benefits of public transit (e.g., Anderson, 2014; Adler & van Ommeren, 2016). For example, Anderson (2014) studies the 2004 strike of public transit workers in Los Angeles. He finds that highway delay increased by 47 percent during the strike. However, removing existing systems is likely to induce very different responses compared with expansion of services. These interruptions are also temporary and silent on long-run behavioral adjustments.

Although building additional road capacity is a natural response to traffic congestion and is widely adopted by policy-makers, there is debate on whether this supply-side approach is effective. The “fundamental law” of highway congestion (Downs, 1962, 2000) suggests that the elasticity of vehicle-kilometers traveled (VKT) with regard to lane kilometers of roads is one. Thus adding road capacity does not lead to less congestion. Hsu & Zhang (2014) suggest the law holds for Japanese highways. Using a panel of US cities, Duranton & Turner (2011) find empirical evidence that the fundamental law may apply to urban roads as well. Although we do not look at VKT, our finding suggests that the elasticity of demand for automobile trips is less than one in Chinese cities. Building subways does reduce congestion.

The remainder of the paper is structured as follows. Section 2 describes the setting, data and the sample. Section 3 presents the empirical analyses. Section 4 documents corroborating evidence of the substitution patterns between modes of transportation. Section 5 discusses benefits from saved commuting time and compares them with the costs of subways. Section 6 concludes.

2 Background, Data and Sample

2.1 Subway Systems in Chinese Cities

China experienced a large boom in subway construction in the past two decades. Using city-level data from Statistical Yearbooks, Figure 1 shows changes in subway length and ridership in China. In 2001, only three cities in Mainland China (Beijing, Shanghai and Tianjin) had a subway system. The combined length of all subway lines was below 400km. By the end of 2017, 30 cities had a total of 4476km of subway lines, and another 12 cities had their first subway lines under construction. Today China boasts the world’s longest, second-longest and fourth-longest metro systems. Half of the top ten busiest subway systems in the world are in China. Subway ridership increased accordingly, from just under 1 billion in 2001 to about 16 billion in 2017.

Rapid subway construction is a response to rapid growth in population and car ownership in China’s major cities. The overall urbanization rate increased from 35% in 2000 to 58% in 2017. Much of the increased urban population is concentrated in large cities. The rapid increase in car ownership helps make major Chinese cities among the world’s most congested and most polluted. Subways are regarded by many city governments as the essential infrastructure to reduce congestion and pollution. Nevertheless, the recent boom in subway construction is also a result of easy credit following the economic slowdown in the early 2010s. Local governments of many smaller cities, eager to take on large infrastructure projects that could bolster economic growth, threw hundreds of millions of dollars behind new subway systems. Despite the large amounts of investment, whether these subway lines achieved the stated goal of reducing congestion is untested empirically.

2.2 Subway Lines in the Sample

We investigate the effect of subway lines on the speed of nearby roads. We focus on subway lines that opened between August 1, 2016 and December 31, 2017. There were 45 such lines (including 7 extensions to existing lines) across 25 cities. Table 1 Panel A lists these lines. The 25 cities include China’s largest metropolises: Beijing, Shanghai, Shenzhen, and Guangzhou. Most of the others are provincial capitals. The table lists the official opening date for each new subway line. It is worth noting that a disproportionately large share of the lines opened towards the end of the calendar year: 25 of the 45 subway lines opened in December. There were 7 new line openings on Dec 28, 2016 alone, and another 6 on Dec 28, 2017. There are reasons to believe that the official opening dates are selected. The clustering of opening dates in a certain period of the year can be correlated with confounding factors. Indeed, the end of the calendar year is also the start of China’s holiday season. Chinese New Year, the nation’s most important holiday, follows a lunar calendar and usually falls between mid-January and February. With schools closed and economic activities slowing down, it is possible that the roads will be less congested with or without the opening of a new subway line. Thus it is important to purge seasonality in traffic patterns.

For each city that has at least one new subway line during the period (henceforth “treated city”), whenever possible, we also select an existing or planned subway line.⁵ Table 1 Panel B lists these lines. Among the 25 such lines there are 22 existing and 3 planned. The opening dates of these lines are, in general, at least one year apart from those of the new subway lines, so they should not confound the effects of the treated lines.

There are 17 cities with existing or planned subway systems that did not have a new line open during the sample period. In each of these control cities, we pick the first subway line to be built or the latest line completed. There are 5 existing lines and 12 planned lines. Control cities are much smaller and less affluent than the treated cities. We will test empirically whether they are suitable as controls.

⁵ The original plan was to use road segments near these lines as controls if an identification based on cross-city comparison does not work. In order for them to be potential comparisons, whenever possible, these lines are chosen such that they are at least 3 kilometers away from the treated subway line, and have a comparable distance to the downtown. Road segments near these lines are used to study how the effects of new subway lines spread through the road network.


2.3 Data on Road Speed

The company we work with is China’s leading player in providing digital map and online navigation services. The company runs a digital map application on mobile devices, which is similar to Google Maps or Apple Maps. With the application installed and location service turned on, mobile devices transmit bits of location information every few seconds to the company’s data center. The location of the device can be matched with a digital map of roads. Speed can be calculated from multiple location records and the time lapse in between. The company boasts 280 million monthly active users. With large amounts of data flowing into the company’s server every second, it is able to compute real-time speed at a very fine geographic level. The company then displays real-time speed on the application and uses it to calculate optimal routes between any two points instantaneously. The company also has a web-based digital map that allows users to view real-time traffic and plan trips using different modes of transportation.⁶

We obtain average hourly speed at the road segment level. A road segment is a short stretch of a road with an average length of about 30 meters. The time span of the data is between August 1, 2016 and January 31, 2018, except for the periods between September 1 and September 10, 2016, and between October 10 and November 30, 2016, for which the original data were not accessible at the time the company prepared data for us. For each day when data are available, we have hourly speed information for each road segment between 7AM and 7PM, which covers both morning and evening rush hours. For each road segment we know the name of the road and the sets of coordinates that cover its start and end points. From these coordinates we calculate the length and the direction of the segment (in both ways). Road segments are classified into five hierarchical categories: highways, urban expressways, arterial streets, sub-arterial streets, and local streets.

The speed data also come with a congestion index, which is calculated as the time needed to travel through the road segment under the current road speed relative to the time needed when there is no traffic. The traffic-free speed is the average speed on the same road segment between midnight and 5 AM. Traffic-free speed has a congestion index equal to 1.⁷
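As a worked illustration of the definition above: travel time over a segment of fixed length is inversely proportional to speed, so the ratio of current travel time to traffic-free travel time reduces to traffic-free speed divided by current speed. The sketch below follows that reading; the function names are hypothetical, and the data provider's exact computation is not described in the text.

```python
from statistics import mean


def traffic_free_speed(night_speeds):
    """Average speed over the night hours (midnight to 5 AM)."""
    return mean(night_speeds)


def congestion_index(current_speed, free_speed):
    """Current travel time relative to traffic-free travel time.

    For a segment of length d, time = d / speed, so the ratio
    (d / current_speed) / (d / free_speed) equals free_speed / current_speed.
    """
    if current_speed <= 0 or free_speed <= 0:
        raise ValueError("speeds must be positive")
    return free_speed / current_speed


free = traffic_free_speed([38.0, 40.0, 42.0, 41.0, 39.0])  # mean is 40.0
rush_hour = congestion_index(20.0, free)  # half the free speed, twice the time
faster = congestion_index(50.0, free)     # faster than the night-time average
```

The last line shows why the index can fall below 1: daytime traffic occasionally moves faster than the night-time average used as the benchmark.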

2.4 Road Segments

Ideally, we would like to obtain data on all road segments in the city. However, the very large sample size makes this unrealistic. For example, within its 5th ring road, Beijing has an area of about 625 square kilometers. It is estimated that there are 250 million road segments, which amounts to about 3.3 trillion observations of hourly traffic over the 18 months in our study.

⁶ The navigation function of major digital map providers uses the underlying real-time speed data. With hundreds of millions of active users, the speed data are usually accurate. In Appendix B, we conduct several checks of the data quality.

⁷ Congestion index can be below 1.


The size of the data with all road segments from all cities with existing and planned subway lines would be astronomical. The next best option would be to select a random sample of road segments to get a manageable sample size. This also proved difficult for at least two reasons. First, without prior knowledge of the magnitude of the effect, it is difficult to determine the percentage of the random sample and the size of the area of study. This problem is aggravated by the prior belief that the effect of subway on road congestion is likely to decrease with distance. Therefore, a random sample with a small sampling percentage of all the road segments in the city may yield underpowered estimates for segments that are likely to be affected. Second, the time of the engineers at the data provider was also a constraint. A stratified sampling procedure that over-samples road segments near new subway lines may give us both enough sample size in regions that are most likely to be affected and an overall representative sample. However, resources allocated to help us extract and prepare the data were too limited for such a moderately complicated sampling procedure.

Faced with these constraints, we extract speed information on all road segments that are in the neighborhood of the chosen subway lines. We first select a 5-kilometer stretch of each subway line listed in Table 1. On average the 5-kilometer segment includes 3 subway stations. We then create a 2.5-kilometer buffer zone on both sides of the segment, and extract all road segments that lie within the buffer zone. Because most subways are designed to alleviate congestion in urban centers, in general, we pick subway segments as close to the downtown as possible. In cities with multiple subway lines included in the sample, we try to adjust the positions of the segments such that their buffer zones do not overlap, so that we can cover as many road segments as possible. Road segments that are some distance away from the new subway line allow us to gauge how the effect spreads to other parts of the city. The raw data have 1.3 million unique road segments and over 3 billion observations on hourly speed.
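The buffer-zone selection can be sketched as a distance test between each road segment and the chosen stretch of subway line. The sketch below rests on several assumptions that are not from the paper: coordinates are already projected to a planar system measured in kilometers, a road segment is represented by its midpoint, and the buffer has rounded ends (the text only says "on both sides of the segment"). All function names are hypothetical.

```python
import math


def point_to_segment_km(p, a, b):
    """Shortest planar distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def in_buffer(midpoint, subway_polyline, radius_km=2.5):
    """True if the midpoint lies within radius_km of any part of the polyline."""
    return any(
        point_to_segment_km(midpoint, a, b) <= radius_km
        for a, b in zip(subway_polyline, subway_polyline[1:])
    )


# A straight 5 km stretch of subway line along the x-axis.
stretch = [(0.0, 0.0), (5.0, 0.0)]
road_midpoints = {"r1": (2.5, 2.0), "r2": (2.5, 3.0), "r3": (6.0, 0.0)}
selected = sorted(k for k, m in road_midpoints.items() if in_buffer(m, stretch))
```

With real latitude and longitude data the coordinates would first need to be projected, or the planar distance replaced by a geodesic one.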

2.5 Baseline Sample

Road Segments Directly Affected by the New Subway Line

Subway lines are often built in places with heavy traffic and aim to alleviate congestion on nearby roads that are over capacity. As a first pass, it is interesting to investigate whether subway has any effect on alleviating road congestion. Distance to the subway is arguably related to how much a road segment is affected, but it is an imperfect measure. Whether the subway is effective in diverting traffic from certain roads depends on the substitutability between subway trips and traffic through these roads.

Road segments that are close substitutes to the new subway line can be identified by using the trip planning feature of the digital map application. The road segments that are directly affected by the new subway line are those en route if one chooses to drive instead of taking the subway. Specifically, we first divide the buffer zone around each new subway line into 1km-by-1km grids. Between any pair of grids, we find the best public transit route (or routes, as the digital map service usually recommends several best alternative routes) and save the collection of pairs of grids whose best public transit routes involve the newly built subway line. For each pair of grids in this collection, we find the best driving route(s). Road segments in these routes are those directly affected by the new subway line. We repeat the same process under the typical traffic conditions for morning and afternoon rush hours on weekdays. Out of hundreds of thousands of road segments in our sample, we pick about 10,000 road segments that are most likely to be directly affected by the new subway line.
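The grid-pair procedure can be summarized in a short sketch. The two routing callables, `transit_routes` and `driving_routes`, are hypothetical stand-ins for the digital map's trip planner, whose actual interface is not described in the text; a transit route is assumed to be a list of line names and a driving route a list of road segment identifiers. Treating pairs as ordered (origin, destination) is also an assumption.

```python
from itertools import permutations


def directly_affected_segments(grid_cells, new_line, transit_routes, driving_routes):
    """Collect road segments on driving routes that substitute for the new line.

    transit_routes(origin, dest) -> list of routes, each a list of line names.
    driving_routes(origin, dest) -> list of routes, each a list of segment ids.
    Both callables are hypothetical placeholders for a trip-planning service.
    """
    affected = set()
    for origin, dest in permutations(grid_cells, 2):
        # Keep the pair only if some recommended transit route uses the new line.
        if not any(new_line in route for route in transit_routes(origin, dest)):
            continue
        # Every segment on a recommended driving route counts as directly affected.
        for route in driving_routes(origin, dest):
            affected.update(route)
    return affected


# Toy lookups standing in for the trip planner.
transit_lookup = {("A", "B"): [["Bus 1", "Line 9"]], ("B", "A"): [["Line 9"]]}
driving_lookup = {
    ("A", "B"): [["r1", "r2"], ["r3"]],
    ("B", "A"): [["r4"]],
    ("A", "C"): [["r5"]],
}
affected = directly_affected_segments(
    ["A", "B", "C"],
    "Line 9",
    lambda o, d: transit_lookup.get((o, d), []),
    lambda o, d: driving_lookup.get((o, d), []),
)
```

In the toy example, segment `r5` is excluded because no recommended transit route between cells A and C uses the new line.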

Comparing Cases

In the baseline, there are 45 groups of treated segments, each representing road segments directly affected by a new subway line. Our baseline identification strategy is a difference-in-differences (DID) specification, where the control road segments are those near the 17 existing or planned subway lines in control cities. The inclusion of control cities and the use of the difference-in-differences approach serve to eliminate the macro trends and seasonality that are common to all cities.⁸
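The logic of the comparison can be seen in the simplest two-group, two-period version of a difference-in-differences estimate. This is only a sketch of the idea: it ignores the week-by-week dynamics, fixed effects and clustered standard errors discussed in the text, and the function name and record layout are assumptions made for illustration.

```python
from statistics import mean


def did_estimate(observations):
    """Two-by-two difference-in-differences on an outcome such as residual log speed.

    observations: iterable of (treated, post, outcome) tuples, where treated
    and post are booleans (hypothetical layout).
    """
    cells = {(t, p): [] for t in (False, True) for p in (False, True)}
    for treated, post, outcome in observations:
        cells[(treated, post)].append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    change_treated = m[(True, True)] - m[(True, False)]
    change_control = m[(False, True)] - m[(False, False)]
    # The change in controls absorbs shocks common to all cities.
    return change_treated - change_control


sample = [
    (True, False, 0.00), (True, False, 0.02),    # treated, before opening
    (True, True, 0.07), (True, True, 0.09),      # treated, after opening
    (False, False, -0.01), (False, False, 0.01),  # control, before
    (False, True, 0.01), (False, True, 0.03),     # control, after
]
effect = did_estimate(sample)
```

In this made-up example the treated segments gain 0.07 log points and the controls 0.02, so the estimate is 0.05, or roughly a 5% increase in speed attributable to the opening.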

Because subway lines open at different times, we create a variable for each treated line that indicates the time relative to the opening date. For each treated line we can use road segments in all 17 cities as controls, setting the relative-time variable for the control lines equal to that of the treated line. Because sample size is not an issue, instead of replicating 45 copies of the control segments, we randomly divide control segments into 45 equal-sized pieces and assign one piece to each treated line. Road segments near each treated subway line and the corresponding control segments form a “comparing case”. In the difference-in-differences estimation, we will cluster the standard errors at the case level.
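The random assignment of control segments to comparing cases can be sketched as a shuffle followed by an even split. The function name and the fixed seed are assumptions for illustration; when the number of control segments is not a multiple of 45, the pieces in this sketch differ in size by at most one segment.

```python
import random


def partition_controls(control_segments, n_cases=45, seed=0):
    """Randomly split control segments into n_cases near-equal pieces.

    Piece i is meant to be paired with treated line i to form one comparing case.
    """
    shuffled = list(control_segments)
    random.Random(seed).shuffle(shuffled)
    # Striding through the shuffled list gives sizes that differ by at most one.
    return [shuffled[i::n_cases] for i in range(n_cases)]


pieces = partition_controls(range(1000))
sizes = [len(piece) for piece in pieces]
```

Each control segment appears in exactly one piece, so no control observation is reused across comparing cases.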

Further Restrictions on the Sample

We impose several additional restrictions on the sample. We first drop weekends and national holidays. We keep only the morning rush hours between 7 AM and 9 AM and the evening rush hours between 5 PM and 7 PM.
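These restrictions amount to a simple filter on each hourly observation. In the sketch below, the holiday calendar is a hypothetical input, and each hourly record is assumed to be labeled by the start of the hour, so that 7 AM to 9 AM corresponds to hours 7 and 8 and 5 PM to 7 PM to hours 17 and 18.

```python
from datetime import date, datetime

RUSH_HOURS = frozenset({7, 8, 17, 18})  # assumes records are labeled by starting hour


def in_baseline_sample(ts, holidays=frozenset()):
    """Keep weekday, non-holiday observations during rush hours."""
    if ts.weekday() >= 5:        # Saturday or Sunday
        return False
    if ts.date() in holidays:    # national holiday (hypothetical calendar)
        return False
    return ts.hour in RUSH_HOURS


holidays = frozenset({date(2017, 1, 2)})  # illustrative entry only
kept = [
    in_baseline_sample(datetime(2017, 1, 3, 8), holidays),   # Tuesday, 8 AM
    in_baseline_sample(datetime(2017, 1, 3, 12), holidays),  # Tuesday, noon
    in_baseline_sample(datetime(2017, 1, 7, 8), holidays),   # Saturday, 8 AM
    in_baseline_sample(datetime(2017, 1, 2, 18), holidays),  # listed holiday
]
```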

We drop local streets from the sample due to concerns about the quality of speed data on those narrow, less travelled streets. GPS location can also be less precise on narrow streets in dense urban areas. Many local streets also have a substantial portion of missing values, presumably because no car has passed through during the hour. Although we could assign traffic-free speed to these missing values, we chose to be conservative and not to do so. We note that local streets may play an important role in alleviating congestion on the main roads. Akbar & Duranton (2017) show that local streets are often not congested even when the main streets are very congested. Thus the existence of small alleys essentially puts an upper bound on road congestion. We leave the substitution patterns between main and minor roads to future research.

⁸ Previous studies on the effect of subway typically adopt an event study approach (Chen & Whalley, 2012; Gendron-Carrier et al., 2018; Yang et al., 2018). This approach relies only on variation over time for identification and is essentially unable to single out potential confounding factors such as time trend and seasonality. This poses a particularly serious challenge in our case as a large share of new line openings coincide with the holiday season.

For each treated subway line and its controls, we include observations that are up to 48 weeks post opening. Although our sample starts on August 1, 2016, the longest time we can track a new subway line is about 1 year after its opening. This is due to three reasons. First, we are missing about two months' worth of observations between September and December 2016, and we also use about one and a half months as the pre-treatment period. Second, the first large batch of subway openings took place in December 2016. Third, in cities with multiple line openings, the gap between two openings is usually about one year. Tracking the effect up to 48 weeks guarantees that we do not include effects from further expansions of the subway system. The results show that the magnitude of the effect stabilizes after the first four months post opening, so the 48-week restriction is unlikely to miss much of the dynamics of the effect.

We also restrict the baseline sample to at most 6 weeks prior to the opening. Subway construction itself may substantially affect traffic conditions on nearby roads, making pre-trends uninformative about the nature of traffic conditions on these roads. We take advantage of an engineering fact of subway construction. Once the digging part of construction is complete, it still takes some time to test the hardware and software systems before the subway opens to the public. For the subway lines we study here, the testing period usually runs 2 to 3 months before opening to the public. We choose the 6 weeks prior to the opening as the pre-treatment period, assuming that road traffic during this time is unaffected by construction.9 We provide robustness checks to validate this assumption. In particular, the inclusion of flexible time trends and a longer pre-treatment period yields quantitatively similar results. An additional 10% of the road segments are dropped from the sample due to missing values during the sample period. We are left with a perfectly balanced panel of road segments.10

The official opening of a subway line usually involves a ceremony attended by government officials and media. In order to make sure that the system is up to the task of carrying real passengers, in some cases there were “test rides” before the official opening, during which residents were invited to ride the new subway line, often with free or reduced fares. The existence of those “test ride” periods may generate a spurious positive effect on road speed in the pre-treatment period. We find detailed project progress for each subway line from various sources.11 We treat these test ride periods differently depending on whether they are contiguous to the official opening date. If the official opening date immediately follows the test ride period, we re-define the opening date as the first day of the test ride. If the test ride period ends several days prior to the official opening date, we drop the days in the test ride period.

9 The construction of a subway usually takes several major steps. The first step involves digging tunnels. Modern shield tunneling technology can advance tunnels underneath the roads. Yet it still requires operations on the ground, such as lifting the tunnel boring machine up and down, trucking rocks and pebbles out, and pumping water from underground. These operations are likely to affect traffic. Then tracks are laid and electrified. Usually one year prior to the line opening, scale test cars are run on the electrified tracks. In the meantime subway stations are under construction. All of this construction is likely to affect traffic on nearby roads. Usually within 6 months before opening, major construction is complete and trains are tested without passengers, while final touches on the system may still be in progress. The final approval for public operation comes after a panel of specialists inspects the system, which usually takes place weeks ahead of the official opening. Appendix C shows two examples of the detailed engineering processes from two subway lines in the sample.

10 Although with gaps, because we do not have data for about 2 months in 2016.

Summary Statistics

Table 2 reports the summary statistics of the treated and control road segments in the baseline sample. Arterial and sub-arterial streets account for the majority of road segments in the sample. Treated and control road segments have similar compositions of road types. Both have about half a percent of highways and 2% of urban expressways. Control cities have somewhat fewer arterial streets but not by a large margin. This is reassuring because it shows that although the control cities are in general much smaller and less prosperous than the treated cities, at least the urban structures of the studied areas are not substantially different from each other. The average speed on Chinese urban roads is slightly above 30 kilometers per hour during rush hours. In the control cities, average speeds for highways, urban expressways, arterial streets, and sub-arterial streets are, respectively, 64, 54, 34 and 29 kilometers per hour. Except for highways, all road types are equally congested. The average congestion index for China’s urban roads is around 1.7 during rush hours, while the most congested 5% of occasions have a congestion index above 3. The average congestion indices are surprisingly similar across all types of roads, suggesting that in equilibrium, there is no room for arbitrage by taking a detour on smaller roads. Average speed and congestion index are also similar between the treated and the control road segments. This is reassuring but is not essential for our identification: level differences will be partialled out in the difference-in-differences estimation.

Weekly Average Residual Log Speed

In order to further reduce dimensionality, we aggregate the speed data to the weekly level. We are interested in changes in road speed, taking into consideration traffic patterns by hour of the day and day of the week. We first run the following regression:

\[
\ln(\text{speed})_{lt} = \lambda_{l,dow,h} + \varepsilon_{lt}, \tag{1}
\]

where $\lambda_{l,dow,h}$ is the complete set of fully interacted terms between indicators for road segment ($l$), day of week ($dow$), and hour of the day ($h$). The residual from this regression, $\widetilde{\ln(\text{speed})}_{lt} = \varepsilon_{lt}$, measures the log point deviation of hourly speed relative to the average speed on the same road segment, in the same hour of the day, on the same day of week. Such a saturated model is not possible without high-frequency data at the granular level. Despite the saturated model, $\widetilde{\ln(\text{speed})}_{lt}$ may still include seasonal patterns, which we intend to purge by comparing the residual log speed in the treated road segments to that in the control road segments.

11 Appendix Table A.1 lists the periods of test rides of the treated subway lines. Links to data sources are also provided.

Hourly residual log speed is then averaged by week relative to the subway opening:

\[
\widetilde{\ln \text{speed}}_{lcw} = \frac{1}{N_w} \sum_{t \in w} \widetilde{\ln(\text{speed})}_{lt},
\]

where $c$ indicates the comparing cases as defined earlier and $w$ is the week relative to opening. Week 0 starts from the day of the line opening. $N_w$ is the number of observations in the week. $\widetilde{\ln \text{speed}}_{lcw}$ should be interpreted as the weekly average log point deviation from the road segment’s “usual” speed.
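Because Equation 1 contains only a fully interacted set of indicators, its residual is simply log speed minus the cell mean for the same segment, day of week, and hour. The sketch below illustrates that demeaning step and the subsequent weekly averaging on made-up observations; the tuple layout and function names are assumptions, not the authors' implementation.

```python
import math
from collections import defaultdict

def residual_log_speed(obs):
    """obs: list of (segment, dow, hour, rel_week, speed) tuples.

    Returns (segment, rel_week, residual) with log speed demeaned within
    each segment x day-of-week x hour cell, as in a saturated regression.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for seg, dow, hour, week, speed in obs:
        sums[(seg, dow, hour)] += math.log(speed)
        counts[(seg, dow, hour)] += 1
    out = []
    for seg, dow, hour, week, speed in obs:
        cell_mean = sums[(seg, dow, hour)] / counts[(seg, dow, hour)]
        out.append((seg, week, math.log(speed) - cell_mean))
    return out

def weekly_average(residuals):
    """Average hourly residuals by segment and week relative to opening."""
    sums, counts = defaultdict(float), defaultdict(int)
    for seg, week, r in residuals:
        sums[(seg, week)] += r
        counts[(seg, week)] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Made-up data: one segment, Mondays at 8 AM, one week before and after opening.
obs = [("a", 0, 8, -1, 30.0), ("a", 0, 8, 0, 33.0)]
print(weekly_average(residual_log_speed(obs)))
```

In this toy case the two residuals are equal and opposite, since each is measured against the mean of the same cell.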

3 Effects of Subway on Road Speed

3.1 Baseline Specification and Results

We adopt a difference-in-differences specification that compares changes in residual log speed in treated road segments following the opening of a new subway line to changes in speed in control road segments. The empirical model can be written as:

\[
\widetilde{\ln \text{speed}}_{lcw} = \sum_{\substack{s = \underline{s} \\ s \neq -1}}^{\bar{s}} \beta_s \cdot T_{lc} \cdot \mathbf{1}(\text{period relative to subway opening} = s)_w + \Omega_{lcw} + \varepsilon_{lw}, \tag{2}
\]

$\Omega_{lcw}$ is a set of control variables, which in the difference-in-differences setting always includes variables indicating treatment status and time periods. In the baseline, $\Omega_{lcw}$ includes $T_l$ and $\lambda_{cw}$. $T_l$ is a binary variable indicating whether the road segment belongs to the treated group. $\lambda_{cw}$ is a set of fully interacted fixed effects between indicators for the comparing case and indicators for the week relative to opening. Because in each case time is realigned to indicate the week relative to the date of subway line opening, $\lambda_{cw}$ captures the macro trend and seasonality common to both treated and control road segments. $s$ is the period relative to subway line opening; it can be weekly, monthly, or quarterly.12 We choose the period prior to subway line opening, $s = -1$, as the “leave-out” period, and $\beta_{-1}$ is imposed to be 0. Standard errors are clustered at the case level. We have 45 cases, which is not a small number. To be conservative on inference, we also report the 95% confidence intervals from the cluster wild bootstrap procedure (Cameron et al., 2008).

Key to the identification of this difference-in-differences specification is the usual parallel trend assumption, which we can assess by testing whether $\beta_s = 0$ for all $s < 0$. We expect $\beta_s > 0$ for $s \geq 0$ if subway is effective in alleviating road congestion.
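To see what the coefficients in Equation 2 capture, it can help to compute them from cell means. The sketch below is a simplification rather than the regression itself: it assumes every case is observed in every period and weights cases equally, in which case the coefficient for period s is the average treated-minus-control gap in period s minus the same gap in the leave-out period. The function name and the input layout are hypothetical.

```python
from collections import defaultdict

def event_study(cell_means, base_week=-1):
    """cell_means: {(case, rel_week, treated): mean residual log speed}.

    Returns {rel_week: beta}, where beta is the average treated-minus-control
    gap in that week, net of the gap in the leave-out week (beta = 0 there).
    Assumes a balanced set of cases and equal weights across cases.
    """
    cells = defaultdict(dict)
    for (case, week, treated), y in cell_means.items():
        cells[(case, week)][treated] = y
    gaps_by_week = defaultdict(list)
    for (case, week), pair in cells.items():
        gaps_by_week[week].append(pair[1] - pair[0])  # treated minus control
    avg_gap = {w: sum(g) / len(g) for w, g in gaps_by_week.items()}
    return {w: gap - avg_gap[base_week] for w, gap in avg_gap.items()}

# Made-up example: two cases sharing a seasonal pattern, plus a level
# difference and a treatment effect that appears from week 0 onward.
season = {-2: 0.00, -1: 0.01, 0: 0.04, 1: 0.02}
effect = {-2: 0.00, -1: 0.00, 0: 0.03, 1: 0.05}
cells = {}
for case, level in [(0, 0.10), (1, -0.05)]:
    for w in season:
        cells[(case, w, 0)] = season[w]
        cells[(case, w, 1)] = level + season[w] + effect[w]
print(event_study(cells))
```

The seasonal component and the level differences drop out, leaving only the assumed treatment effect, which is the logic behind including the case-by-week fixed effects and the treatment indicator.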

Equation 2 is the specification to estimate dynamic effects. To estimate the static effect, we specify a version of the difference-in-differences model in which we group all prior periods and all posterior periods, respectively:

\[
\widetilde{\ln \text{speed}}_{lcw} = \beta \cdot T_{lc} \cdot \text{post}_w + \Omega_{lcw} + \varepsilon_{lw}. \tag{3}
\]

$\text{post}_w$ is a binary variable indicating whether the time is after the subway line opening. $\beta$ measures the average change in speed after the subway line opening, compared with that in the control road segments.

12 We define every 4 weeks as a month and every 12 weeks as a quarter.
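In its simplest two-group, two-period form, the static coefficient in Equation 3 is a difference of differences in mean residual log speed. The sketch below uses made-up group means to illustrate the arithmetic and also shows the exact conversion from log points to a percentage change; for effects of this size the two are nearly identical, which is why log-point estimates are commonly read as percentages. The function names and inputs are hypothetical.

```python
import math

def did(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences of group means (in log points)."""
    return (treated_post - treated_pre) - (control_post - control_pre)

def log_points_to_percent(beta):
    """Exact percentage change implied by a log-point difference."""
    return 100.0 * (math.exp(beta) - 1.0)

# Made-up mean residual log speeds, for illustration only.
beta = did(treated_pre=0.000, treated_post=0.058,
           control_pre=0.000, control_post=0.010)
print(round(beta, 3), round(log_points_to_percent(beta), 2))
```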

Figure 2 shows the results of estimating Equation 2. Dots show the estimates of $\beta_s$ and spikes represent 95% confidence intervals. In the first panel we estimate a coefficient for each week relative to the time of opening. The coefficient associated with the week prior to opening ($\beta_{-1}$) is imposed to be zero. All other coefficients should be interpreted as the effect of subway on road speed relative to that period. Between 6 and 2 weeks prior to a subway line opening, there is no discernible pre-trend in speed on nearby roads relative to that in control cities. Individually, $\beta_s$ is not statistically different from 0 for any pre-treatment period. The lack of differences in pre-trends between the treated and control road segments is reassuring for the identification assumption.

Following the opening of a subway line, speed on nearby road segments increases immediately. In the first week after the opening, speed on treated road segments increases by 2.6%. The effect gradually increases to above 6% between 13 and 16 weeks after opening, before declining and stabilizing at around 4%. Therefore, the effect of subway over time seems to be hump-shaped.

The second and the third panels in Figure 2 show the results where the estimates are by month and by quarter, respectively. The estimates of the effects in post periods are all positive and statistically significant, while those associated with pre-periods are uniformly not statistically different from 0. The magnitude of the effects is around 4.5% on average. The hump-shaped pattern is salient in the monthly estimates. The effect peaks in the 4th month with a coefficient of 7.1%. In fact, we can reject that the coefficient in the 4th month post opening is the same as that in the first month. We can also reject that it is the same as that in any month after 6 months post opening.

There could be several potential explanations for the hump-shaped effect over time. First, the hump shape could be driven by changes in subway ridership. The initial increase could be due to the slow transition from using automobiles to riding the subway, and the later decline could be due to some people switching back to using automobiles. If this is the case, we should also observe a hump-shaped pattern in subway ridership in the first few months post opening. We do not have ridership numbers by line at the weekly or monthly level, so it is difficult to test this hypothesis directly. For a subsample of the subway systems in their sample, Figure 2 in Gendron-Carrier et al. (2018) plots the average daily ridership per 1,000 people in the metropolitan area. There does seem to be a hump shape in the first few months following a subway system opening. The second possible explanation is that as long as the demand elasticity for automobile trips is non-zero, improved traffic conditions will induce more driving. A little delay in responses in both the initial take-up of subway trips and the later induced increases in driving would generate the hump-shaped effect on road speed. Nevertheless, the fact that the stabilized effect is positive suggests that the demand elasticity for road trips is less than one.

Table 3 reports results from estimating various versions of Equation 3, where the congestion-relieving effect of subway is consolidated into one parameter. The model in Column 1 includes a binary variable for treatment status ($T_l$) and a set of week-to-opening fixed effects fully interacted with case fixed effects ($\lambda_{cw}$), the same controls as in Figure 2. Similar to what is shown in the graphs, a new subway line increases speed on directly affected road segments by 4.8%. The estimate is statistically different from zero with standard errors clustered at the case level. The pair of numbers in brackets shows the 95% confidence interval from the wild residual bootstrap procedure, which also does not include 0.

The remaining columns experiment with different controls in $\Omega_{lcw}$. Column 2 controls for case fixed effects ($\lambda_c$) and week-to-opening fixed effects ($\tau_w$) separately, making the specification somewhat less saturated. The estimate is almost identical. Column 3 makes the model more saturated than that in Column 1 by adding a set of road segment fixed effects ($\lambda_l$), again yielding an almost identical estimate. This is not surprising because the outcome variable is already the residual from road segment average speed and the sample is a perfectly balanced panel, so the inclusion of segment fixed effects does not affect the estimation of time fixed effects.

One common concern with the difference-in-differences specification is that the treated and the control units may be on different secular trends. With pre-treatment periods matched well, Figure 2 suggests that this is unlikely. As a robustness check, we control for time trends and allow them to differ for treated and control segments. Let $w$ be the week relative to opening. In addition to the controls already included in Column 3, Column 4 adds the interaction term between the indicators for the relative week $w$ and the treatment dummy. We do not need to include $w$ itself since we already control for the case-by-week-to-opening fixed effects. The resulting coefficient is 5.7%. Columns 5 and 6 control for treatment-specific polynomials in time trends up to the 3rd and 5th power, respectively. The estimated coefficients are 3.2% and 3.0%, respectively, and remain statistically significant.

3.2 Robustness Checks

Placebo Test

To further rule out the possibility that our results are driven by some confounding time trend, we test whether the timing of the effect coincides with subway opening. To do that, we include in each comparing case weekly observations that are up to 48 weeks away from the subway line opening. We repeat estimating Equation 3 many times, each with a placebo subway opening date between 48 weeks prior and 48 weeks posterior to the actual opening date. The Wald statistics associated with $\beta$ from these regressions are then plotted against the placebo week of opening relative to the actual week of opening. If the results are truly due to subway opening instead of some confounding factors, the largest Wald statistic should be found around the actual time of subway opening. Figure 3 shows that, indeed, the Wald statistic peaks around the actual time of subway openings.13
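The logic of the placebo scan can be illustrated with a deliberately simplified sketch. It tracks the placebo difference-in-differences estimate rather than the Wald statistic used in the paper, and it operates on a made-up series of treated-minus-control gaps with a clean step at the true opening week; the function name, the window, and the step size are all assumptions. With a step at week 0, moving the placebo date in either direction mixes treated and untreated weeks on one side of the cut, so the estimate is largest at the true date.

```python
def placebo_scan(gap_by_week, shifts):
    """gap_by_week: {rel_week: treated-minus-control gap in residual log speed}.

    For each placebo opening week k, compare the mean gap on or after k
    with the mean gap before k (a simplified stand-in for Equation 3).
    """
    out = {}
    for k in shifts:
        pre = [g for w, g in gap_by_week.items() if w < k]
        post = [g for w, g in gap_by_week.items() if w >= k]
        out[k] = sum(post) / len(post) - sum(pre) / len(pre)
    return out

# Made-up gap series: zero before opening, 0.05 log points afterwards.
gap = {w: (0.05 if w >= 0 else 0.0) for w in range(-12, 12)}
scan = placebo_scan(gap, range(-8, 9))
best = max(scan, key=scan.get)
print(best, round(scan[best], 4))
```

In noisy data the estimate would be scaled by its standard error, which is what the Wald statistic does.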

Longer Pre-periods

We restrict the sample to be within 6 weeks prior to a subway line opening out of the concern that subway construction can directly affect road congestion. Within the 6-week window of the pre-treatment period, all major construction should have concluded and traffic on the ground should be back to “normal”. Nevertheless, we can also purge this “abnormality” in traffic due to construction by including flexible time trends separately for the treated and the control segments. We can then include earlier periods and test whether our results remain robust.

We extend the sample to include up to 48 weeks before the opening and repeat the same estimations as in Table 3. Panel A of Table 4 reports the results. In Columns 1 through 3, as we do not control for time trends by treatment status, we get an effect between 3.1% and 3.9%. Once we control for flexible time trends, estimated coefficients in Columns 4 through 6 are around 5%, similar to the baseline result.

Sub-sample by Time of Opening

Our data cover the 18 months between August 2016 and January 2018. The opening dates of the treated lines are spread across this period. Therefore, although we restrict the observations from each treated line to be within 6 weeks before and 48 weeks after the opening, the number of periods in the sample differs across comparing cases. In the estimation of dynamic effects (Equation 2), identification of the effect in each period relative to the opening date comes from a different mix of cases. This may cause challenges in the interpretation of our results, depending on the dynamics and homogeneity of the effects across treated lines. There are three scenarios. First, the effect may be homogeneous across all lines but dynamic over time. This is the case where the effect changes over time but all treated cases follow an identical trajectory. Because treated lines vary in the number of periods included in the sample, $\beta$ from Equation 3 is the weighted average of the effects over different periods, where the weight is the number of observations in each period. Thus $\beta$ cannot be simply interpreted as the average post-treatment effect. Second, the effect may be static and not change over time (there is no dynamic effect), but be heterogeneous across different lines. $\beta$ from Equation 3 is then the weighted average of the effects over different lines, where the weight is the number of observations each line has in the sample. However, for the dynamic specification in Equation 2, even though there are no dynamic effects, because the identification of different periods comes from different lines, which have different static effects, the estimated $\beta_s$ will differ by $s$. In other words, we will get spurious dynamic effects due to changing mixtures of different lines over time. Finally, when the effects are both heterogeneous and dynamic, estimates from neither Equation 2 nor Equation 3 have an intuitive interpretation.

13 The Wald statistic reaches its peak if we define the subway opening time as 2 weeks after the actual opening dates. This is because the effect of subway is relatively small in the first few weeks, as is shown in the first panel of Figure 2.
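The second scenario, in which static but heterogeneous effects produce spurious dynamics, can be made concrete with a toy calculation. The two lines, their effect sizes, and their observation windows below are invented for illustration; the pooled estimate for each period is taken to be the simple average over the lines observed in that period.

```python
# Hypothetical lines: each has a constant (static) effect, but line B is
# only observed for the first four post-opening weeks.
lines = {
    "A": {"effect": 0.02, "last_week": 7},
    "B": {"effect": 0.08, "last_week": 3},
}

def pooled_effect(week):
    """Average static effect over the lines still observed in this week."""
    observed = [v["effect"] for v in lines.values() if week <= v["last_week"]]
    return sum(observed) / len(observed)

profile = [round(pooled_effect(w), 4) for w in range(8)]
print(profile)
```

The pooled profile drops after week 3 even though neither line's effect changes over time, purely because line B leaves the sample.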

To address this issue, we divide the treated lines into two sub-samples: the “early” lines and the “late” lines, depending on whether the opening date of the subway line is before or after February 28, 2017. For the early lines, we include observations that are within 6 weeks before and 48 weeks after the line opening, so our data cover all these periods for these lines.14 For the late lines, we include observations that are within 36 weeks before and 4 weeks after the line opening (our sample ends on January 31, 2018). We estimate Equations 2 and 3 separately for each sub-sample. Now in the dynamic specification, each $\beta_s$ is estimated using information from the same set of subway lines. Because we have a strictly balanced sample, the weight from each line in different periods is the same. In the static specification, $\beta$ is the weighted average of the effects from each line, where the weight is the number of observations in each case.

Panel B of Table 4 shows the results of estimating various versions of Equation 3 with the subsample of early lines. The average effect of this group of subway lines on nearby road speed over a period of 48 weeks is between 4% and 5%. Panel C of Table 4 shows that for the subsample of late lines, the effect is between 2% and 3%. The smaller coefficient for the late group is expected, because only 4 weeks post-opening are observable for this group, and we know that the effect rises gradually in the first few months.

Figure 4 shows the estimates of Equation 2, separately for the early lines and the late lines. The graphs here show only estimates at the weekly level. Those at the monthly and quarterly levels are available in Appendix Figure A.6. For the group of subway lines that opened relatively early, there is still a hump-shaped effect over time, and the pattern is somewhat stronger. Because the coefficient for each period is estimated using the same set of lines and the same weights, a changing composition of lines with heterogeneous effects cannot explain this hump shape. The effect peaks at a level of about 8% 12 weeks after the opening and stabilizes at slightly below 5% from the 16th week. For the group of subway lines that opened relatively late, we find that the coefficient associated with every single pre-treatment period up to 36 weeks before opening is small and not statistically different from zero. There is a positive effect on road speed immediately following the line opening and the effect keeps increasing for the first 4 weeks. We are only able to track 4 weeks post opening for all subway lines in the group. The average effect is about 4%. In fact, for the first 4 weeks post-opening, the dynamic effect from the late lines tracks well with that from the early lines.

14 We exclude Tianjin Line 6 (opened on August 6, 2016) and Zhengzhou Line 2 (opened on August 19, 2016) because our data cover less than 6 weeks before their openings.


3.3 Alternative Specifications

China’s subway construction boom and our high-frequency speed data that cover multiple cities allow us to adopt a difference-in-differences approach, which we argue is crucial for identification. Most existing studies in the literature have relied on an event study approach, which usually looks at a single event (Chen & Whalley, 2012; Anderson, 2014; Yang et al., 2018). Gendron-Carrier et al. (2018) use multiple subway openings and estimate a two-way fixed effect model with unit and time fixed effects. This can be seen as a generalized event study that exploits variation in treatment time to estimate time fixed effects. Yet it is still different from our difference-in-differences approach because every unit in their sample gets treated at some point. It is instructive to compare our baseline specification to those alternative econometric models.

Event Study

The event study approach essentially compares the outcome before and after the policy change. One can control for flexible time trends and identify a discontinuous change in the outcome in the neighborhood of the policy change. This approach is sometimes also called the regression discontinuity design using time as the running variable (Imbens & Lemieux, 2008; Lee & Lemieux, 2010). One crucial limitation of such an empirical design is that it cannot distinguish the treatment effect of the policy change from other time-varying confounding factors, including seasonality and macro trends. In our case, most of the subway line openings are concentrated before China’s holiday season, and our data show that urban roads tend to become less congested as the holiday season approaches. A naive event study will mistakenly attribute the seasonal increase in speed to the causal effect of subway openings.

Figure 5 illustrates this point. From the baseline sample, we plot weekly average residual log speed separately for the treated and control road segments. The weeks are re-aligned relative to the week of subway line opening so we can take averages across all cases (first averaged within each subway line, then averaged across the 45 cases). The red dots show the weekly averages of residual log speed from treated road segments, and the teal crosses show the weekly averages from control road segments. The dashed lines are Lowess non-parametric smooth fits, separately for the treated (in red) and for the control (in teal), and separately for before and after subway line openings.

Speed on treated road segments shows a clear jump of about 5% at the time of a subway line opening. However, this could be misleading because the pattern could also be driven by seasonality. Speed on control road segments exhibits a similar, albeit smaller, jump of about 1% around the time of subway line opening in treated cities. A comparison of the non-parametric smoothing lines for the treated and control road segments suggests that there is indeed some speed-enhancing effect of subway line opening, but the true effect is smaller than what a simple event study using only the treated segments would suggest.


A formal regression discontinuity design where the running variable is the time relative to subway opening can be specified as follows:

\[
\widetilde{\ln \text{speed}}_{lw} = \beta_0 + \beta_1 \cdot \text{post}_{lw} + \beta_2 \cdot \text{post}_{lw} \cdot w + f(w) + \varepsilon_{lw}, \tag{4}
\]

where $\widetilde{\ln \text{speed}}_{lw}$ is the residual log speed from a full set of fixed effects with indicators for road segment ($l$) fully interacted with indicators for week to opening ($w$). $\text{post}_{lw}$ is a binary variable which takes value 1 for weeks on or after the subway opening. $w$ is the week relative to the subway opening, with $w = 0$ for the week when the subway line opens. $f(w)$ is a flexible function of $w$, which is taken to be a polynomial in $w$ of up to the 5th order. $\varepsilon_{lw}$ is the error term. $\beta_1$ captures the discontinuity in levels of average log speed as the subway line opens. $\beta_2$ captures a possible trend break after the subway line opening. We vary the bandwidth, using up to 48 weeks before and after the subway line opening. We split the sample into treated-only and control-only sub-samples and estimate Equation 4 separately on each sub-sample.

Panel A of Table 5 presents the results from the regression discontinuity design on the treated road segments. Column 1 uses the baseline sample. It includes a linear time trend, $f(w) = w$. The result shows that the average speed on treated road segments increases by 6.3% after the subway line opens. Panel B presents the regression discontinuity results from the control road segments. With the same specification as in the corresponding column in Panel A, it finds an “effect” of about 1.4%. Therefore, an event study approach relying only on the treated lines over-estimates the true effect. It is worth noting that subtracting the estimated coefficient in Panel B from that in Panel A gives 4.9%, an effect that is similar to the baseline (corresponding to Column 4 in Table 3). As higher-order polynomials are added in Columns 2 and 3, the estimates for both treated and control segments become smaller. In particular, the coefficients in Panel B become only marginally significant. The differences between the coefficients in the two panels remain fairly close to those in the baseline (Columns 5 and 6 in Table 3). Column 4 further allows a linear time trend break after the line opening, but the trend break is similar for both the treated and the control road segments. Columns 5 to 8 expand the sample period to 48 weeks before and 48 weeks after the subway line opening. For both treated and control lines we get slightly larger effects. Nevertheless, throughout these columns, the effect for the treated segments is larger than that for control segments by between 4.3% and 5%, which is close to the results from the baseline difference-in-differences estimates.

Two-way Fixed Effects Model

The baseline model restricts the comparison to be between the treated and the control segments at the same calendar time by defining a time variable that is relative to the time of subway line opening. With time variation in the opening of subway lines, we can also estimate the standard two-way fixed effect model:

\[
\widetilde{\ln \text{speed}}_{lt} = \beta \cdot T_l \cdot \text{post}_{lt} + \lambda_l + \tau_t + \varepsilon_{lt}, \tag{5}
\]


where $l$ indicates road segment and $t$ indicates calendar week. $T_l = 1$ if the road segment is treated. $\text{post}_{lt} = 1$ if the week is after the subway opening. $\lambda_l$ is a set of road segment fixed effects, and $\tau_t$ is a set of calendar week fixed effects. $\varepsilon_{lt}$ is the error term, which we cluster at the subway line level.

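A dynamic version can be written as

\[
\widetilde{\ln \text{speed}}_{lt} = \sum_{\substack{s = \underline{s} \\ s \neq -1}}^{\bar{s}} \beta_s \cdot T_l \cdot \mathbf{1}(\text{week rel. to opening} = s)_{lt} + \lambda_l + \tau_t + \varepsilon_{lt}, \tag{6}
\]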

where $s$ indicates the week relative to the subway line opening. We include observations from the treated lines that are within 6 weeks prior to the subway opening and 48 weeks after the opening. For control road segments, we include observations from all weeks and set the week relative to opening equal to -99.

The inclusion of control road segments helps estimate the calendar week fixed effects, which is essential to partial out seasonality and the common time trend. Notice that this specification is less saturated than the baseline specification, which uses road segments that are never treated during the sample period to control for seasonality and the common time trend. In fact, the two-way fixed effect model can be estimated without control road segments. In this case, variation in opening dates among treated road segments helps the estimation of calendar week fixed effects. One issue is that towards the end of the sample period, as more and more lines get treated, the variation used to estimate the later time fixed effects gets thinner, resulting in less precise estimates. We estimate Equations 5 and 6 with and without control road segments.

One can think of this approach as a generalized difference-in-differences approach in which, after partialing out calendar time fixed effects, observations from units that are treated later serve as controls for those from units that are treated earlier. One remaining issue with this identification strategy is that, as units gradually get treated over time, the set of control units is constantly changing. Abraham & Sun (2018) show that the estimate from such event studies with staggered adoption is a non-convex average of the cohort-specific treatment effects and is not causally interpretable when treatment effects are heterogeneous. They propose an interaction-weighted estimator. This approach separately estimates causal effects for each cohort, defined by the time of treatment. These cohort-specific treatment effects are then averaged using the share of observations in the corresponding relative time period as weights.
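The final averaging step of the interaction-weighted approach can be sketched as follows. This covers only the aggregation described above, not the estimation of the cohort-specific effects themselves; the function name `interaction_weighted` and the numbers are hypothetical and chosen purely for illustration.

```python
import numpy as np

def interaction_weighted(beta_cs, n_cs):
    """Average cohort-specific effects by each cohort's share of observations.

    beta_cs[c, s]: estimated effect for cohort c in relative period s (NaN if unobserved).
    n_cs[c, s]:    number of observations for cohort c in relative period s.
    """
    beta_cs = np.asarray(beta_cs, dtype=float)
    n_cs = np.asarray(n_cs, dtype=float)
    observed = ~np.isnan(beta_cs) & (n_cs > 0)
    weights = np.where(observed, n_cs, 0.0)
    shares = weights / weights.sum(axis=0, keepdims=True)
    return (shares * np.where(observed, beta_cs, 0.0)).sum(axis=0)

# Two cohorts, three relative periods; the later cohort is not observed in the last period.
beta_cs = [[0.04, 0.06, 0.05],
           [0.02, 0.04, np.nan]]
n_cs = [[100, 100, 100],
        [300, 100, 0]]
iw = interaction_weighted(beta_cs, n_cs)
print(iw)
```

Each weight is non-negative and the weights sum to one within a relative period, so the result is a convex average of the cohort-specific effects.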

Table 6 reports the results from estimating Equation 5. Columns 1 and 2 use only treated segments while Columns 3 and 4 use both treated and control segments. Columns 1 and 3 are unweighted while Columns 2 and 4 use the weights suggested by Abraham & Sun (2018). The results are surprisingly similar across all specifications. The estimated coefficients, between 4.3% and 4.8%, are also strikingly close to the baseline result. Figure 6 shows the corresponding results from estimating the dynamic specification in Equation 6. One thing to notice is that as time moves on and more and more units get treated, we have fewer and fewer control units. So the coefficients associated with later periods are less precisely estimated. We still see a gradual increase in the effect in the first few weeks, but due to noisy estimates for the later periods, we do not see a clear hump shape. Nevertheless, all four specifications suggest that a subway line opening increases nearby road speed by about 5%.

Effects by Individual Lines

One reason for the striking similarity between the two-way fixed effects models with and without weight adjustment is that the effects are largely homogeneous. We can estimate the effect separately for each treated subway line using the baseline specification:

$$\widetilde{lnspeed}_{lcw} = \beta \cdot T_{lc} \cdot post_w + \lambda_{cw} + \varepsilon_{lw}, \tag{7}$$

Only one case (road segments near a treated subway line and the corresponding control road segments) is included in each regression. Standard errors are clustered at the subway line level.¹⁵ The graph on the left in Figure 7 reports the results of these estimates. Because new subway lines opened at different times during the sample period, and the effects are dynamic over time, the $post_w$ dummy captures different periods relative to the opening dates for different lines. To avoid attributing dynamics to heterogeneous effects, we also estimate versions of the model in which we restrict the sample to be between 6 weeks prior to and 4 weeks after the opening. The graph on the right in Figure 7 reports the results. 39 out of 45 cases have a positive effect and most estimates are bounded between 0 and 10%.

3.4 Heterogeneous Effects by Segment Characteristics

The treated road segments in the baseline results are those that are close substitutes to the new subway lines. Arguably this is the group of road segments for which the congestion-relieving effect of the subway is the largest. Instead of focusing on a subset of roads, a more interesting question is how much the subway reduces overall road congestion. There is at least one constraint and one caveat to properly answering this question. The constraint is that we do not have speed data on all, or a random sample of, the road segments in the city. The caveat is that the effect of a subway line on the overall congestion in the city depends on the size of the city and the location of the subway. A couple of subway lines may be sufficient for a medium-sized city. China’s largest cities, such as Beijing and Shanghai, have more than 20 million residents and already have over a dozen subway lines.

In light of these two observations, we approach the question by estimating heterogeneous effects on road segments with different characteristics. With these heterogeneities at hand, we are able to say something about the effect of a typical subway line in a city of a certain size and with a hypothetical road network.

¹⁵ When the number of treated groups is small (in this case, there is only one), the difference-in-differences estimator tends to over-reject the null. Ferman & Pinto (in press) suggest a bootstrap approach to adjust the p-value. Appendix Figure A.7 shows the 95% confidence intervals using their proposed bootstrap approach. Under these confidence intervals, none of the individual estimates are statistically significant.

To do that, we include all road segments we have in the sample.¹⁶ To study the effect at the city level, we also group subway lines that opened within the same city-month and consolidate the corresponding comparison cases. In each consolidated case with multiple line openings, the date of the first subway line opening is assigned to be the opening date for the consolidated case. We consolidate 45 line openings into 35 city-level cases.

We first investigate how the effect differs by the road segment’s geographic relation to the treated subway line. The data allow us to investigate the heterogeneous effects along three main dimensions: first, whether the road segment is directly affected by the treated subway; second, the road segment’s distances to the treated subway and to the existing subway; third, whether the road segment is largely parallel or orthogonal to the treated subway line. When there are multiple treated subway lines in the sample, we measure a road segment’s relation to the nearest treated subway line.

Column 1 of Table 7 shows that on average a subway line opening increases speed on road segments in our sample by 2.5%. This average effect does not translate into the effect at the city level because our sample covers non-random patches of the city. Column 2 shows that the effect on the road segments we select for the baseline is indeed larger than the sample average. Compared with other road segments in the sample, the speed increase in these directly affected segments is another 1.5 percentage points higher.

Columns 3 and 4 show that the congestion-reduction effect diminishes quickly as we move farther away from the treated subway. The coefficient in Column 3 suggests that the effect drops by about 1 percentage point as the distance between the road segment and the subway doubles. Using a non-parametric specification, Column 4 shows that a new subway line increases speed on road segments within 1 km by 4.1%, but the effect falls to around 1.6%-2.6% for those that are more than 1 km away.
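To see what the "doubling" statement implies for the underlying coefficient, suppose (as an assumption, since the exact Column 3 specification is not reproduced here) that the treatment effect is linear in log distance, with hypothetical symbols $\beta_0$ for the level and $\gamma$ for the slope:

$$\text{effect}(d) = \beta_0 + \gamma \ln d \quad\Longrightarrow\quad \text{effect}(2d) - \text{effect}(d) = \gamma \ln 2 .$$

A drop of about 1 percentage point per doubling then corresponds to

$$\gamma \approx \frac{-0.01}{\ln 2} \approx -0.014 ,$$

so moving from distance $d$ to $4d$ would lower the effect by roughly 2 percentage points under this functional form.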

Column 5 adds to the specification in Column 3 an interaction term between the treatment and the log distance to the nearest existing subway line. This is intended to capture the network effect of the subway system: whether the opening of a new subway line improves the capacity of the whole subway system and relieves traffic on roads near existing subway lines. We find suggestive evidence for the existence of such a network effect. Speed on roads near existing subway lines also increases, although the coefficient is not precisely estimated.

Column 6 shows that the effects are concentrated in road segments that are initially more congested, where the initial congestion level is defined as the average congestion index in the 6 weeks prior to the line opening. Columns 7 and 8 show that road segments that are parallel to the subway see a larger increase in speed than those that are orthogonal to the subway line. This difference also diminishes quickly with distance.

¹⁶ Except for local streets, which are still dropped from the sample.


The remaining two columns investigate the effect by road category. Because there are few highway segments, we group them with urban expressways. Column 9 suggests that the subway has little effect on alleviating traffic on highways and urban expressways. This is probably because most trips that take highways and urban expressways in Chinese cities are relatively long-distance and are not substitutable by subway trips. The traffic-relieving effect is the same on both arterial and sub-arterial roads, and the effect declines with distance at a similar rate.

4 Evidence on Substitution between Modes of Transportation

Less congested roads mean less traffic on the road. It would be instructive to investigate how subway trips substitute for other modes of transportation. Ridership on the new subway line could come from induced new trips, which are a response to increased convenience in travelling. Subway ridership could also come from trips diverted from other public transit (mainly buses) or from driving (private cars). Diverting trips from driving is likely to generate significant reductions in traffic, as private cars take up a large amount of road space per passenger. Bus rides take much less road space per passenger. But buses move slowly and make frequent stops, which may generate large negative externalities on other vehicles on the road. We study the substitution patterns between subway trips and other modes of transportation using data at the city level and the household level.

Information on transportation infrastructure and usage at the city level is from the Statistical Yearbooks of Chinese cities, compiled and published by China’s National Bureau of Statistics. These yearbooks report key statistics on urban transportation. Although the variables included in these publications vary by year and city, annual subway ridership, bus ridership, and car ownership are available for most major cities in recent years. We use panel data from 36 cities between 2010 and 2017. We investigate simple correlations between subway ridership and volumes of alternative modes of transportation while controlling for year and city fixed effects. Column 1 of Table 8 shows a clear substitution between subway and bus trips: as the number of subway trips increases by 100, the number of bus trips declines by 22. Increases in subway ridership also reduce the service mileage of buses. Column 2 shows that total bus mileage falls by 5 kilometers for every additional 100 subway trips, although the estimate is not statistically significant at any conventional level. Admittedly, the substitution between subway and bus ridership is not necessarily a result of individual choices. It could be due to changes in the city’s provision of public transportation. That is, when a new subway line opens, the city government may reduce bus services on routes that overlap with the new subway services. There is also a clear negative correlation between subway ridership and the average length of bus trips. It is possible that with the expansion of the subway system, bus services are rerouted to mainly solve the last-mile problem of transporting passengers to subway stations. Column 3 shows some evidence that increases in subway trips are associated with a reduced number of registered civil-use cars (excluding trucks and commercial-use cars). The coefficient is not statistically significant. The magnitude of the coefficient nevertheless indicates that 5,000 subway trips a year translate into one fewer car. This suggests a moderate substitution between subway trips and car ownership. The statistical yearbooks do not report vehicle kilometers travelled by private cars, so we cannot examine car use.

Information on individual-level trips is from one-day travel diaries, which are part of the Household Travel Surveys in Beijing. We obtain two rounds of surveys, from 2010 and 2015. The finest identifiable geographic level in the survey is the Transportation Analysis Zone (TAZ). In 2010, Beijing was divided into about 2,000 TAZs. TAZs are very small geographic areas. Within the fourth ring road, where the density of subway lines is the highest, the median TAZ has an area of 1.3 square kilometers. We measure the change in a TAZ’s access to the subway between 2010 and 2015 by the change in the length of subway lines that cut through the TAZ. The travel diaries record the mode of transportation of each trip. Conditional on the group of respondents who have a trip on the day of the survey, we calculate the share of trips that use different modes of transportation. Trips that only use walking or cycling are excluded. We calculate shares of trips that use buses, subways, and cars. Other modes of transportation include motorcycles and mopeds. Notice that an individual can use multiple modes of transportation in one trip, so the sum of the shares can exceed 1. We then estimate the correlation between changes in a TAZ’s access to the subway and changes in the proportion of residents who take a certain mode of transportation. Because the surveys are repeated cross-sections, we control for detailed household and individual characteristics to reduce the risk of spurious correlation due to changes in the composition of respondents.
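The share calculation described above can be sketched as follows, to make concrete why the shares can sum to more than 1. The mode labels, the function name `mode_shares`, and the toy diary are hypothetical; the actual survey coding is not described in the text.

```python
def mode_shares(trips, modes=("bus", "subway", "car")):
    """Share of trips using each mode, excluding trips made only by walking or cycling."""
    non_motorized = {"walk", "bike"}
    # Keep a trip only if it uses at least one mode other than walking or cycling.
    kept = [set(t) for t in trips if not set(t) <= non_motorized]
    if not kept:
        return {m: 0.0 for m in modes}
    return {m: sum(m in t for t in kept) / len(kept) for m in modes}

# Toy diary: each trip lists every mode used on that trip.
trips = [
    ["bus", "subway"],   # multi-mode trip counts toward both bus and subway
    ["car"],
    ["walk"],            # excluded: walking only
    ["walk", "subway"],
    ["bike", "bus"],
]
shares = mode_shares(trips)
print(shares, sum(shares.values()))
```

Because the first trip counts toward both the bus and the subway share, the three shares in this toy example add up to more than 1.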

Table 9 reports these correlations. Panel A includes all individuals who had a trip. 16% of the individuals use the subway, 41% use the bus, and 45% use a car. Improved access to the subway is positively associated with subway trips and negatively correlated with bus and car trips. The coefficients show that 4 additional subway trips are associated with 2 fewer bus trips, and 3 additional subway trips are associated with 1 fewer car trip. Households that value subway trips more may move to neighborhoods that are expecting to have improved access to the subway. Although it is hard to identify causal effects using the current data, we can alleviate concerns about spurious correlation. Panel B includes only households that had not moved since 2009. The results are essentially unchanged. To investigate driving behavior in more detail, Panel C further restricts the sample to households that had a car in 2009.¹⁷ For this smaller sample, access to the subway is associated with more subway trips and fewer bus trips (a large coefficient, but not statistically significant), yet there is no evidence of reduced car trips. To sum up, city and household level evidence shows that the subway reduces both bus and car trips, with a substantial margin coming from reduced bus trips.

¹⁷ The 2015 survey asks the age of each vehicle. We include all households in the 2010 survey that owned a car and households in the 2015 survey that owned a car that was made in or before 2010. This imputation is not perfect. Households that owned a car in 2010 but replaced it with a new car between 2010 and 2015 will be mistakenly excluded from the sample. First-time car owners after 2010 who bought a second-hand car made before 2010 will be mistakenly included in the sample. The lower share of car trips in this sample compared with those in Panel A and Panel B is probably due to the fact that the sample here is more skewed towards the 2010 survey.

5 Value of Reduced Congestion

Subway lines cost billions of dollars to construct. They are also significantly more expensive to operate than buses or streetcars. Whether subway lines create benefits large enough to justify their high costs is an important policy question. It is also a difficult question because benefits come from multiple sources over many years. These benefits include, but are not restricted to, lower time costs for commuters who continue to drive; lower time costs for commuters who switch from other modes of transportation (such as bus) to the subway; monetary savings (or losses) compared with the costs of using other modes of transportation; value from more trips induced by improved connectedness between locations; increases in property values; as well as other benefits such as reductions in air and noise pollution. Evaluating all these sources of benefits is beyond the scope of this paper.¹⁸

This paper has focused on identifying the congestion-reduction effect of the subway on nearby roads, so here we concentrate on the benefits from reduced traffic congestion. We use Beijing as an example because it offers the best data among all cities in the sample. Suppose the average road is about 2 kilometers away from the nearest subway line, which according to our estimates would suggest that the subway increases speed on the average road by 2%. In 2016, the average commuter in Beijing spent 56 minutes one way on a typical work day. Let us assume that this number applies to the average driver as well. A 2% increase in average speed saves 1.06 minutes one way and 2.12 minutes per day. The average annual wage in Beijing was 92,456 yuan (13,320 USD) in 2016. Assuming that a typical full-time worker works 2,000 hours a year (250 working days, 8 hours per day), this translates into an average wage per minute of 0.77 yuan (0.12 USD). Suppose the monetary value of commuting time is half that amount. In Beijing, 5.7 million people commute by car every day. So the monetized value of saved time is about 1.2 billion yuan, or 179 million USD, per year.¹⁹ Beijing has the busiest subway system in China and is one of the country’s most congested cities. One would expect a subway line’s benefit in other cities to be smaller.
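The arithmetic in this paragraph can be reproduced in a few lines. The sketch below uses only figures quoted in the text; the variable names are ours, and the one-way time saving of 1.06 minutes is taken as reported rather than recomputed.

```python
# Inputs quoted in the text (Beijing, 2016).
annual_wage_yuan = 92_456
hours_per_year = 2_000            # 250 working days x 8 hours
minutes_saved_one_way = 1.06      # as reported in the text
working_days = 250
car_commuters = 5.7e6

# Wage per minute, and the assumed value of commuting time (half the wage).
wage_per_minute = annual_wage_yuan / (hours_per_year * 60)
value_of_time = wage_per_minute / 2

# Annual monetized time savings for car commuters (round trip, all working days).
annual_benefit_yuan = (value_of_time * minutes_saved_one_way * 2
                       * working_days * car_commuters)

print(f"wage per minute: {wage_per_minute:.2f} yuan")
print(f"annual benefit: {annual_benefit_yuan / 1e9:.2f} billion yuan")
```

The result is linear in every input, so halving the assumed value of time or the number of car commuters halves the estimated benefit.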

¹⁸ Recent studies use general equilibrium models to capture many sources of benefits of urban transit over a very long period of time (e.g., Tsivanidis, 2018; Heblich et al., 2018).

¹⁹ 0.77/2 (unit time value) × 1.06 (time saved) × 2 (round trip) × 250 (number of working days a year) × 5.7 million (number of commuters).

The cost of a subway consists of construction cost and operational cost. Yang et al. (2018) estimate that the construction cost of the Beijing subway is 92 million US dollars per kilometer. The system is now 600 kilometers long. Assume that it is designed to last 100 years, so the construction cost averaged over each year is 552 million US dollars. On the operational side, it is reported that government subsidies account for 50% of the Beijing subway system’s operational cost.²⁰ The Beijing subway system had 3.78 billion rides in 2017. So the subsidy is 7.56 billion yuan (1.16 billion USD). This simple calculation shows that the value of saved commuting time for drivers is a small fraction of the cost.
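The comparison of benefits and costs can likewise be written out. The sketch below uses the figures quoted in the text; the variable names are ours. The implied subsidy of 2 yuan per ride is our reading of the reported totals (7.56 billion yuan over 3.78 billion rides), not a number stated explicitly in the text.

```python
# Construction cost, annualized over an assumed 100-year lifetime (USD).
cost_per_km_usd = 92e6
system_length_km = 600
lifetime_years = 100
annual_construction_usd = cost_per_km_usd * system_length_km / lifetime_years

# Operating subsidy reported in the text.
subsidy_yuan = 7.56e9
subsidy_usd = 1.16e9
rides = 3.78e9
subsidy_per_ride_yuan = subsidy_yuan / rides   # implied, not stated in the text

# Annual time-saving benefit for drivers, as reported in the text (USD).
annual_benefit_usd = 179e6

annual_cost_usd = annual_construction_usd + subsidy_usd
benefit_to_cost = annual_benefit_usd / annual_cost_usd

print(f"annualized construction cost: {annual_construction_usd / 1e6:.0f} million USD")
print(f"implied subsidy per ride: {subsidy_per_ride_yuan:.2f} yuan")
print(f"benefit as share of annual cost: {benefit_to_cost:.1%}")
```

Under these inputs the time savings for drivers cover roughly a tenth of the annualized cost, which is the sense in which the text calls the benefit a small fraction.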

6 Conclusions

This paper studies the effect of subways on road congestion. We use crowd-sourced big data on road speed, which is new to the literature. The use of high-frequency big data combined with a setting with many subway line openings allows us to adopt a saturated econometric model that credibly identifies the causal impacts. We compare different econometric models to highlight potential confounding factors.

We find that the opening of a new subway line immediately and significantly increases speed on nearby roads. The dynamic effect of a subway opening exhibits a hump shape: the congestion-reducing effect of the subway increases in the first few weeks before it gradually declines. About 16 weeks after the line opening, the effect stabilizes at about 5%. The effect is most concentrated in road segments that were initially congested, and it declines quickly with distance.

We show corroborating evidence from city and household level travel data that the improvement in speed comes from reduced road traffic. In particular, there is strong substitution between subway and bus trips. Although we cannot precisely decompose the sources of the reduced volume of road traffic, we suspect that a substantial portion comes from reduced bus ridership.

Simple back-of-the-envelope calculations show that the monetized benefit of time saved from reduced traffic is only a small fraction of the subway’s construction and operation costs. However, in order to pin down the welfare impact of the subway, one needs to take many factors into consideration and probably adopt a general equilibrium approach. This is of great interest for future research.

²⁰ This figure predates the fare change in 2014 from a flat rate of 2 yuan per ride to a distance-based pricing system that costs between 3 and 9 yuan per ride.

References

Abraham, S., & Sun, L. (2018). Estimating Dynamic Treatment Effects in Event Studies with Heterogeneous Treatment Effects (Tech. Rep.).

Adler, M. W., & van Ommeren, J. N. (2016). Does public transit reduce car travel externalities? Quasi-natural experiments’ evidence from transit strikes. Journal of Urban Economics, 92, 106–119.

Akbar, A. P., & Duranton, G. (2017). Measuring the cost of congestion in a highly congested city: Bogotá (Tech. Rep.).

Anderson, M. L. (2014). Subways, strikes, and slowdowns: The impacts of public transit on traffic congestion. American Economic Review, 104(9), 2763–96.

Baum-Snow, N., & Kahn, M. E. (2005). Effects of Urban Rail Transit Expansions: Evidence from Sixteen Cities, 1970–2000. Brookings-Wharton Papers on Urban Affairs, 147–206.

Cameron, A. C., Gelbach, J. B., & Miller, D. L. (2008). Bootstrap-based improvements for inference with clustered errors. Review of Economics and Statistics, 90(3), 414–427.

Chen, Y., & Whalley, A. (2012). Green infrastructure: The effects of urban rail transit on air quality. American Economic Journal: Economic Policy, 4(1), 58–97.

Couture, V., Duranton, G., & Turner, M. A. (2018). Speed. Review of Economics and Statistics.

Davis, L. W. (2008). The effect of driving restrictions on air quality in Mexico City. Journal of Political Economy, 116(1), 38–81.

Downs, A. (1962). The law of peak-hour expressway congestion. Traffic Quarterly, 16(3).

Downs, A. (2000). Stuck in traffic: Coping with peak-hour traffic congestion. Brookings Institution Press.

Duranton, G., & Turner, M. A. (2011). The fundamental law of road congestion: Evidence from US cities. American Economic Review, 101(6), 2616–52.

Ferman, B., & Pinto, C. (in press). Inference in differences-in-differences with few treated groups and heteroskedasticity. Review of Economics and Statistics.

Gendron-Carrier, N., Gonzalez-Navarro, M., Polloni, S., & Turner, M. A. (2018). Subways and urban air pollution (Tech. Rep.). National Bureau of Economic Research.

Gonzalez-Navarro, M., & Turner, M. A. (2018). Subways and urban growth: Evidence from earth. Working paper, Brown University.

Gu, Y., Deakin, E., & Long, Y. (2017). The effects of driving restrictions on travel behavior evidence from Beijing. Journal of Urban Economics, 102, 106–122.

Heblich, S., Redding, S. J., & Sturm, D. M. (2018). The Making of the Modern Metropolis: Evidence from London (Tech. Rep.).

Hsu, W.-T., & Zhang, H. (2014). The fundamental law of highway congestion revisited: Evidence from national expressways in Japan. Journal of Urban Economics, 81, 65–76.

Imbens, G., & Lemieux, T. (2008). Regression Discontinuity Design: A Guide to Practice. Journal of Econometrics, 142(2), 615–635.

Lee, D. S., & Lemieux, T. (2010). Regression Discontinuity Designs in Economics. Journal of Economic Literature, 48, 281–355.

Li, S. (in press). Better lucky than rich? Welfare analysis of automobile license allocations in Beijing and Shanghai. Review of Economic Studies.

Parry, I. W. H., & Small, K. A. (2009). Should Urban Transit Subsidies Be Reduced? American Economic Review, 99(3), 700–724.

Severen, C. (2018). Commuting, Labor, and Housing Market Effects of Mass Transportation: Welfare and Identification (Tech. Rep.). Federal Reserve Bank of Philadelphia Working Paper WP 18-14.

Tsivanidis, N. (2018). The Aggregate and Distributional Effects of Urban Transit Infrastructure: Evidence from Bogotá’s TransMilenio (Tech. Rep.).

Voith, R. (1991). The long-run elasticity of demand for commuter rail transportation. Journal of Urban Economics, 30(3), 360–372.

Winston, C., & Langer, A. (2006). The effect of government highway spending on road users’ congestion costs. Journal of Urban Economics, 60(3), 463–483.

Winston, C., & Maheshri, V. (2007). On the social desirability of urban rail transit systems. Journal of Urban Economics, 62(2), 362–382.

Yang, J., Chen, S., Qin, P., Lu, F., & Liu, A. A. (2018). The effect of subway expansions on vehicle congestion: Evidence from Beijing. Journal of Environmental Economics and Management, 88, 114–133.

Figures and Tables

Figure 1: Subway Construction in China

[Plot omitted. Vertical axes: total length (km), 0 to 5000, and total ridership (million), 0 to 15000. Horizontal axis: year, 2000 to 2020. Series: ridership, length.]

Note: Data from Statistical Yearbooks of Cities, published annually by China’s National Bureau of Statistics.


Figure 2: Effects of Subway on the Speed of Directly Affected Road Segments

[Plots omitted. Three panels: weekly estimates; weekly and monthly estimates; weekly, monthly, and quarterly estimates. Vertical axis: coeff. Horizontal axis: weeks to subway opening, -8 to 48.]

Notes: Each mark represents a βs in Equation 2. The spikes represent the 95% confidence intervals.


Figure 3: Wald Statistics from Estimates with Placebo Opening Dates

[Plot omitted. Vertical axis: Wald statistic, 0 to 4. Horizontal axis: week relative to opening, -48 to 48.]

Notes: Plot of Wald statistics for tests of β at various placebo weeks of subway opening, relative to the true week of opening.


Figure 4: Dynamic Effects for Subgroups

[Plots omitted. Two panels of weekly estimates. First panel: Subway Lines Opened before Feb 28, 2017; horizontal axis: weeks to subway opening, -8 to 48. Second panel: Subway Lines Opened after Feb 28, 2017; horizontal axis: weeks to subway opening, -36 to 4. Vertical axis: coeff.]

Note: The sample in the first graph includes cases of subway lines that opened before February 28, 2017. Observations up to 6 weeks prior to and 48 weeks after the subway line openings are included. The sample in the second graph includes cases of subway lines that opened after February 28, 2017. Observations up to 36 weeks prior to and 4 weeks after the subway line opening are included. Appendix Figure A.6 shows results from dynamic effects for both subgroups at different aggregations of time periods.


Figure 5: Event Study

[Plot omitted. Vertical axis: mean residual log speed, -0.04 to 0.08. Horizontal axis: week to subway opening, -8 to 48. Series: treated, control.]

Note: Each circle represents the average weekly residual log speed in treated road segments. Each cross represents the corresponding value from control road segments. The averages are first taken at the case by treatment status by week level, then taken across all cases.


Figure 6: Difference-in-Differences with Calendar Time

[Plots omitted. Four panels of weekly estimates. Panel A: treated lines only, unweighted. Panel B: treated lines only, weighted. Panel C: treated and control lines, unweighted. Panel D: treated and control lines, weighted. Vertical axis: coeff. Horizontal axis: weeks to subway opening, -8 to 48.]

Note: Panel A and Panel B are from weekly-level estimations of Equation 6 using the treated lines only. Panel A is unweighted. Panel B regulates the weight in each period using the interaction-weighted estimator proposed by Abraham & Sun (2018). Panel C and Panel D include both treated and control lines. Panel C is unweighted. Panel D uses the number of observations in the subway line and week-to-opening as a share of the total number of observations in the corresponding week-to-opening as the weight. Standard errors are clustered at the subway line (case) level.


Figure 7: Case-by-Case Estimates

[Plots omitted. Two coefficient plots (left and right), one row per treated subway line, labeled by city and line. Horizontal axis: -0.1 to 0.4. Markers: coeff. with 95% C.I.]

Note: Graph on the left: case-by-case estimates with standard errors clustered at the subway line level; weeks relative to opening between -6 and 47 are included in the sample. Graph on the right: weeks relative to opening between -6 and 4 are included in the sample.


Table 1: Subway Lines in the Sample

Panel A: Subway Lines Opened between August 1, 2016 and December 31, 2017

City          Line(s)                             Open Date
Beijing       16                                  12/31/16
Beijing       Xijiao                              12/30/17
Changchun     1                                   6/30/17
Chengdu       4 (2nd phase east and west)         6/2/17
Chengdu       10                                  9/6/17
Chengdu       7                                   12/6/17
Chongqing     Airport                             12/28/16
Chongqing     5, 10                               12/28/17
Dalian        1                                   6/8/17
Foshan        Guang-Fo                            12/28/16
Fuzhou        1 (2nd phase north)                 1/6/17
Guangzhou     6 (2nd phase), 7 (1st phase)        12/28/16
Guangzhou     9, 13                               12/28/17
Hangzhou      2 (1st phase northwest)             7/3/17
Harbin        3                                   1/26/17
Hefei         1                                   12/26/16
Hefei         2                                   12/26/17
Kunming       3                                   8/29/17
Nanchang      2                                   8/18/17
Nanjing       4                                   1/8/17
Nanjing       S3                                  12/6/17
Nanning       1                                   12/28/16
Nanning       2                                   12/28/17
Qingdao       3 (2nd phase)                       12/18/16
Qingdao       2                                   12/10/17
Shanghai      9                                   12/30/17
Shenzhen      7, 9                                10/28/16
Suzhou        2 (2nd phase)                       9/24/16
Suzhou        4                                   4/15/17
Tianjin       6                                   8/6/16
Wuhan         6, Airport                          12/28/16
Wuhan         8, Yangluo                          12/26/17
Xi’an         3                                   11/8/16
Xiamen        1 (1st phase)                       12/31/17
Zhengzhou     2                                   8/19/16
Zhengzhou     1 (2nd phase), Suburban             1/12/17

Panel B: Existing or Planned Subway Lines in Cities with New Subway Openings

City          Line(s)                             Open Date
Beijing       15                                  12/30/10
Changchun     3                                   6/30/11
Changsha      2                                   4/29/14
Chengdu       1                                   9/27/10
Chongqing     1                                   3/18/11
Dalian        2                                   5/22/15
Fuzhou        5 (1st phase)                       2021
Guangzhou     6                                   12/28/13
Guiyang       1 (old town)                        2018
Hangzhou      4 (1st phase)                       2/2/15
Harbin        1                                   9/26/13
Kunming       6                                   6/28/12
Nanchang      1                                   12/26/15
Nanjing       1                                   9/3/05
Qingdao       3                                   12/16/15
Shanghai      10                                  2010
Shanghai      11 extension                        4/26/16
Shenzhen      5                                   6/22/11
Shenzhen      11                                  6/28/16
Suzhou        1                                   4/28/12
Tianjin       1                                   12/28/84
Wuhan         3                                   12/28/15
Xi’an         1                                   9/15/13
Xiamen        2                                   2019
Zhengzhou     1 (1st phase)                       12/28/13

Panel C: Existing or Planned Subway Lines in Cities without New Subway Openings

City          Line(s)                             Open Date
Changsha      2                                   4/29/14
Changzhou     1                                   2019
Dongguan      1                                   2022
Hohhot        1 (1st phase)                       2020
Jinan         R2                                  9/30/21
Lanzhou       1                                   2018
Luoyang       2                                   2022
Nantong       1                                   2021
Ningbo        2                                   9/26/15
Shaoxing      1                                   2022
Shenyang      1                                   9/27/10
Shijiazhuang  3                                   2022
Taiyuan       2                                   2020
Urumqi        3                                   2021
Wuhu          2                                   2020
Wuxi          1                                   7/1/14
Xuzhou        1                                   2019

Notes: Opening dates of subway lines are from pages on Baidu Baike, Wikipedia, and various news sources.


Table 2: Summary Statistics of the Baseline Sample

Panel A: treated road segments

                        # of       # of unique   avg speed   congestion index
                        obs        segments      (km/h)      average   p5     p50    p95
all road segments       491,237    7,395         31.67       1.73      1.07   1.5    3.22
highways                2,401      36            77.77       1.26      .96    1.07   2.3
urban expressways       11,388     219           50.58       1.81      1.07   1.49   3.46
arterial streets        217,994    3,259         32.43       1.77      1.06   1.51   3.48
sub-arterial streets    259,454    3,881         29.77       1.69      1.09   1.49   3.04

Panel B: control road segments

                        # of       # of unique   avg speed   congestion index
                        obs        segments      (km/h)      average   p5     p50    p95
all road segments       759,444    12,178        31.54       1.7       1.08   1.51   3.01
highways                3,515      56            64.22       1.54      1.06   1.35   2.88
urban expressways       16,313     269           54.44       1.55      1.06   1.38   2.71
arterial streets        292,803    4,743         34.1        1.7       1.06   1.48   3.14
sub-arterial streets    446,813    7,110         28.77       1.7       1.1    1.53   2.94

Note: Each observation is a road segment-by-hour. Segments in the baseline regression sample are included. Treated road segments are those directly affected by the new subway lines.
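As a quick consistency check on Table 2, the sketch below recomputes the "all road segments" average speed as the observation-weighted mean of the four road types. The numbers are copied from the table; the assumption that the pooled row is an observation-weighted average, and the names `TREATED`, `CONTROL`, and `weighted_mean_speed`, are ours rather than the paper's.

```python
# (number of observations, average speed in km/h) by road type, from Table 2.
TREATED = {
    "highways": (2_401, 77.77),
    "urban expressways": (11_388, 50.58),
    "arterial streets": (217_994, 32.43),
    "sub-arterial streets": (259_454, 29.77),
}
CONTROL = {
    "highways": (3_515, 64.22),
    "urban expressways": (16_313, 54.44),
    "arterial streets": (292_803, 34.1),
    "sub-arterial streets": (446_813, 28.77),
}


def weighted_mean_speed(rows):
    """Observation-weighted mean of the per-type average speeds."""
    total_obs = sum(n for n, _ in rows.values())
    return sum(n * speed for n, speed in rows.values()) / total_obs


treated_all = weighted_mean_speed(TREATED)  # compare with 31.67 in Panel A
control_all = weighted_mean_speed(CONTROL)  # compare with 31.54 in Panel B
```

Under this assumption, both recomputed values round to the pooled averages reported in the table, and the per-type observation counts sum to the pooled counts.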

Table 3: Baseline Difference-in-differences Estimations

                 (1)            (2)            (3)            (4)            (5)            (6)
treated×post     0.048***       0.048***       0.047***       0.057***       0.032***       0.030***
                 (0.009)        (0.010)        (0.008)        (0.009)        (0.010)        (0.009)
                 [0.032,0.064]  [0.030,0.067]  [0.032,0.062]  [0.041,0.077]  [0.013,0.050]  [0.013,0.047]

road segment FE: X X X X
case FE: X
week-to-open FE: X
case-by-week-to-open FE: X X X X X
treated×week-to-open polynomial: 1 3 5

Note: The sample includes observations between 6 weeks prior to and 48 weeks since a subway line opening. Columns 1 and 2 also include a binary variable indicating treatment status. Standard errors clustered at the case level are in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. Numbers in brackets show the 95% confidence interval of the corresponding coefficient from the wild bootstrap procedure.
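To read the magnitudes in Table 3, the sketch below converts the treated×post coefficients into percent changes in speed. It assumes the dependent variable is log speed, as in the later tables, so that a coefficient b implies a change of exp(b) - 1. The helper name `log_points_to_percent` is hypothetical.

```python
import math


def log_points_to_percent(beta):
    """Percent change implied by a coefficient on a log outcome."""
    return 100.0 * math.expm1(beta)  # expm1(b) = exp(b) - 1, accurate for small b


# treated×post coefficients from columns (1) to (6) of Table 3.
estimates = {1: 0.048, 2: 0.048, 3: 0.047, 4: 0.057, 5: 0.032, 6: 0.030}
percent_effects = {col: log_points_to_percent(b) for col, b in estimates.items()}
```

For coefficients this small, the percent change is close to 100 times the coefficient, which is why an estimate of 0.048 is described as an increase in speed of about 5%.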


Table 4: Robustness by Sample

Panel A: 48 weeks around subway opening
                 (1)            (2)            (3)            (4)            (5)            (6)
treated×post     0.036***       0.039***       0.031***       0.047***       0.051***       0.051***
                 (0.013)        (0.013)        (0.008)        (0.010)        (0.010)        (0.009)
                 [0.012,0.060]  [0.016,0.062]  [0.016,0.048]  [0.027,0.066]  [0.032,0.071]  [0.033,0.069]

Panel B: lines opened before Feb, 2017
                 (1)            (2)            (3)            (4)            (5)            (6)
treated×post     0.042***       0.049***       0.046***       0.075***       0.083***       0.043***
                 (0.014)        (0.017)        (0.015)        (0.018)        (0.011)        (0.011)
                 [0.032,0.071]  [0.029,0.076]  [0.032,0.071]  [0.047,0.099]  [0.041,0.103]  [-0.001,0.047]

Panel C: lines opened after Feb, 2017
                 (1)            (2)            (3)            (4)            (5)            (6)
treated×post     0.027*         0.030**        0.021**        0.029***       0.035**        0.045***
                 (0.015)        (0.013)        (0.009)        (0.009)        (0.013)        (0.013)
                 [0.006,0.058]  [0.013,0.059]  [0.004,0.046]  [0.014,0.053]  [0.018,0.078]  [0.011,0.060]

road segment FE: X X X X
case FE: X
week-to-open FE: X
case-by-week-to-open FE: X X X X X
treated×week-to-open polynomial: 1 3 5

Note: The sample in Panel A includes observations between 48 weeks prior to and 48 weeks after a subway line opening. The sample in Panel B includes cases of subway lines that opened before February 28, 2017 and observations up to 6 weeks prior to and 48 weeks after the subway line opening. The sample in Panel C includes cases of subway lines that opened after February 28, 2017 and observations up to 36 weeks prior to and 4 weeks after the subway line opening. Columns 1 and 2 also include a binary variable indicating treatment status. Clustered standard errors in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. Numbers in brackets show the 95% confidence intervals of the corresponding coefficients from the wild residual bootstrap procedure.


Table 5: Regression Discontinuity Using Time as the Running Variable

Panel A: treated
                     (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
post                 0.063***   0.046***   0.044***   0.055***   0.069***   0.061***   0.066***   0.068***
                     (0.012)    (0.012)    (0.010)    (0.013)    (0.015)    (0.013)    (0.011)    (0.012)
order of polynomial  1          3          5          5          1          3          5          5
weeks in sample      [-6,47]    [-6,47]    [-6,47]    [-6,47]    [-48,47]   [-48,47]   [-48,47]   [-48,47]
N                    284,301    284,301    284,301    284,301    492,996    492,996    492,996    492,996

post×weeks to open: 0.023** (0.011), 0.003 (0.003)

Panel B: control
                     (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
post                 0.014***   0.008      0.009*     0.017***   0.026***   0.011**    0.018***   0.019***
                     (0.004)    (0.005)    (0.005)    (0.006)    (0.005)    (0.005)    (0.005)    (0.005)
order of polynomial  1          3          5          5          1          3          5          5
weeks in sample      [-6,47]    [-6,47]    [-6,47]    [-6,47]    [-48,47]   [-48,47]   [-48,47]   [-48,47]
N                    358,152    358,152    358,152    358,152    681,948    681,948    681,948    681,948

post×weeks to open: 0.017*** (0.005), 0.001 (0.001)

Note: The dependent variable is the residual log speed. Standard errors are clustered at the treated subway line (case) level. * p < 0.1, ** p < 0.05, *** p < 0.01.

Table 6: Difference-in-Differences with Calendar Week

                   (1)         (2)         (3)         (4)
dep var            residual log speed
treat×post         0.043***    0.044***    0.044***    0.048***
                   (0.009)     (0.010)     (0.009)     (0.012)
link FE            X           X           X           X
calendar week FE   X           X           X           X
weights                        X                       X
N                  375762      375762      2011256     2011256
sample             treated only            treated and control

Note: Results from estimating Equation 5. In Column 2 we use the interaction-weighted estimator proposed by Abraham & Sun (2018). Column 4 weights each subway line and week-to-opening cell by its number of observations as a share of the total number of observations in the corresponding week-to-opening. Standard errors in parentheses, clustered at the subway line level. * p < 0.1, ** p < 0.05, *** p < 0.01.


Table 7: Heterogeneous Effects by Link’s Type and Location

Columns (1)-(5)

treat×post (TP): 0.025*** (0.005), 0.025*** (0.005), 0.038*** (0.008), 0.042*** (0.008)
TP×directly affected links: 0.015* (0.009)
TP×log dist to new subway: -0.009** (0.003), -0.008** (0.004)
TP×dist to new subway
    [0, 1km]: 0.041*** (0.007)
    [1, 2km]: 0.026*** (0.006)
    [2, 5km]: 0.016*** (0.006)
    [5, 10km]: 0.024** (0.006)
    above 10km: 0.021*** (0.006)
TP×dist to the nearest existing subway line: -0.008 (0.006)

Columns (6)-(10)

treat×post (TP): 0.024*** (0.005), 0.024*** (0.005), 0.024*** (0.005)
TP×initial congestion index (de-meaned): 0.067*** (0.006)
TP×parallel: 0.003 (0.003), 0.020*** (0.005)
TP×parallel×log dist: -0.012*** (0.004)
TP×road type
    highways and express: 0.008 (0.009), 0.002 (0.018)
    arterial: 0.027*** (0.006), 0.040*** (0.009)
    sub arterial: 0.025*** (0.005), 0.038*** (0.008)
    highways and express×log dist: 0.004 (0.009)
    arterial×log dist: -0.010** (0.004)
    sub arterial×log dist: -0.009** (0.004)

Note: The dependent variable in all regressions is the weekly average residual log speed. All regressions include case-by-week to opening fixed effects and case-by-road segment fixed effects. The number of observations is 3,745,885. Standard errors clustered at the case level are in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01.


Table 8: Subway and Other Modes of Transportation

                                (1)              (2)                 (3)
                                bus ridership    bus mileage         # of cars
                                per resident     per 100 residents   per 10,000 residents
                                (rides/person)   (kilometers)        (count/10,000 persons)
subway ridership per resident   -0.218**         -5.312              -2.203
                                (0.103)          (10.466)            (3.467)
city FE                         X                X                   X
year FE                         X                X                   X
mean dependent variable         263.7            9623.1              2123.1
N                               529              249                 426
N of cities                     36               36                  36

Note: Data are from City Statistical Yearbooks published annually by the National Bureau of Statistics. All variables are divided by population. Standard errors are clustered at the city level.


Table 9: Individual Trip Mode from Travel Diaries

Panel A: all individuals
                       (1)        (2)         (3)
                       subway     bus         car
subway length in TAZ   0.084***   -0.042***   -0.025*
                       (0.013)    (0.015)     (0.014)
mean dep var           0.158      0.411       0.453
N                      63710      63710       63710

Panel B: households not moved since 2009
                       (1)        (2)         (3)
                       subway     bus         car
subway length in TAZ   0.085***   -0.037**    -0.028*
                       (0.013)    (0.016)     (0.015)
mean dep var           0.156      0.415       0.455
N                      60591      60591       60591

Panel C: households not moved and had a car since 2009
                       (1)        (2)         (3)
                       subway     bus         car
subway length in TAZ   0.053*     -0.056      0.004
                       (0.031)    (0.042)     (0.038)
mean dep var           0.203      0.450       0.395
N                      12482      12482       12482

hhd and ind chars      X          X           X
TAZ FE                 X          X           X
year FE                X          X           X

Note: The data come from the one-day travel diaries in the Beijing Household Travel Surveys in 2010 and 2015. The observation is at the individual level. The sample includes individuals who have trip records on the day of the survey. Trips that use only walking and bicycles are excluded. The outcome variable is a binary variable indicating whether the corresponding mode of transportation is used in a day’s trips. The explanatory variable of interest is the length of subway lines in the Travel Analysis Zone (TAZ), the finest identifiable geographic unit in the datasets. Note that a person can use more than one mode in a day’s trips. Household characteristics include dummies for household income brackets, home ownership, house type (commercial apartment, work unit dormitory, low-income housing, etc.), whether the household has kids under age 5, and household size. Individual characteristics include gender, age, and indicators for educational levels, industry, and occupation. Standard errors are clustered at the TAZ level. * p < 0.1, ** p < 0.05, *** p < 0.01.


Appendix

A More Details on the Data

Figure A.1: Source of Speed Data

Note: Screenshot from the web version of our data provider’s website. It shows the color-coded roads in Beijing at 8:30 AM on Sep 3, 2018 (Monday).


Figure A.2: Illustration of Road Segment Selection

[Figure: map with legend entries Before20160801_station, Before20160801_line, 20160801to20171231_station, 20160801to20171231_line, and buffer; scale bar: 10 km.]

Note: Roads, shown as grey lines, are from OpenStreetMap. Colored lines and dots indicate subway lines and stations, which are digitized manually from Baidu Maps. Yellow boxes are buffer zones around new subway lines.


Figure A.3: Road Segments Directly Affected by the Subway

Note: Screenshots from the data provider’s website.


Table A.1: Official Open Dates and Test Ride Periods

City        Line                              Open Date   Test Ride Dates      Sources
Shanghai    9                                 12/30/17
Beijing     16                                12/31/16
Beijing     Xijiao                            12/30/17    2/28/18              1∗
Nanjing     S3                                12/6/17
Nanjing     4                                 1/8/17
Nanning     1                                 12/28/16
Nanning     2                                 12/28/17
Xiamen      1 (first phase)                   12/31/17    10/6/17-10/11/17     1,2
Harbin      3                                 1/26/17
Dalian      1                                 6/8/17      6/7/17               1
Tianjin     6                                 8/6/16      7/4/16-7/19/16       1
Guangzhou   Guang-Fo                          12/28/16
Guangzhou   7 (first phase)                   12/28/16
Guangzhou   6 (second phase)                  12/28/16
Guangzhou   13                                12/28/17
Guangzhou   9                                 12/28/17
Chengdu     4 (second phase east)             6/2/17
Chengdu     4 (second phase west)             6/2/17
Chengdu     7                                 12/6/17
Chengdu     10                                9/6/17
Hefei       1                                 12/26/16
Hefei       2                                 12/26/17    12/6/17-12/10/17     1
Nanchang    2                                 8/18/17
Kunming     3                                 8/29/17
Wuhan       Airport                           12/28/16
Wuhan       Yangluo                           12/26/17
Wuhan       6                                 12/28/16
Wuhan       8                                 12/26/17
Shenzhen    7                                 10/28/16
Shenzhen    9                                 10/28/16
Fuzhou      1 (second phase north)            1/6/17      12/25/16-1/3/17      1,2
Suzhou      4                                 4/15/17     3/25/17-3/31/17      1
Suzhou      2 (second phase)                  9/24/16
Xi’an       3                                 11/8/16
Guiyang     1                                 12/28/17
Zhengzhou   2                                 8/19/16     8/10/16-8/19/16      1,2
Zhengzhou   1 (second phase)                  1/12/17
Zhengzhou   Suburban                          1/12/17
Chongqing   5                                 12/28/17    9/30/17-11/10/17     1
Chongqing   10                                12/28/17
Chongqing   Airport                           12/28/16
Changchun   1                                 6/30/17     6/25/17-6/29/17      1,2
Qingdao     3 (second phase)                  12/18/16    12/7/16-12/9/16      1
Qingdao     2                                 12/10/17    12/3/17-12/5/17      1
Hangzhou    2 (first phase northwest)         7/3/17

∗ Beijing’s Xijiao Line opened on 12/30/17. An accident on 1/1/18 closed the line until 2/28/18.
Notes: Test ride dates of new subway lines are from pages on Baidu Baike, Wikipedia, and various news sources.


B Speed Data Verification

In the era of smartphones, hundreds of millions of drivers use digital navigation systems every day. The company that provided our data is a leading player in this area in China. It boasts 280 million monthly active users. The accuracy of its route recommendations depends heavily on its real-time traffic condition database, on which our sample is based. To the best of our knowledge, this paper is the first in the urban economics literature to use such data, so it is necessary to verify the quality of this new type of data.

Figure A.4 shows the average weekly log speed of control road segments near existing or planned subway lines. All days are in the sample, including weekends and holidays. We first take residuals from regressing log speed on the full set of segment-by-day of week-by-hour fixed effects. The residual log speed should therefore be interpreted as the approximate percent deviation from the average speed of the segment in the given day of week and hour. Each cross is the residual log speed averaged at the weekly level across all road segments.
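The residualization step can be sketched as follows. Regressing log speed on a full set of segment-by-day of week-by-hour indicators is equivalent to subtracting the mean log speed within each such cell, which is what this minimal example does. The record layout and the function name `residual_log_speed` are illustrative assumptions, not the paper's code.

```python
import math
from collections import defaultdict


def residual_log_speed(records):
    """Demean log speed within segment-by-day of week-by-hour cells.

    `records` is a list of (segment_id, day_of_week, hour, speed_kmh) tuples.
    Returns one residual per record, in the same order.
    """
    cells = defaultdict(list)
    for segment, dow, hour, speed in records:
        cells[(segment, dow, hour)].append(math.log(speed))
    cell_means = {key: sum(vals) / len(vals) for key, vals in cells.items()}
    return [
        math.log(speed) - cell_means[(segment, dow, hour)]
        for segment, dow, hour, speed in records
    ]


# Two Monday 8 AM readings on the same (hypothetical) segment, one week apart.
records = [("seg-1", 0, 8, 30.0), ("seg-1", 0, 8, 33.0)]
residuals = residual_log_speed(records)
```

In this toy case the second reading is 10% faster than the first, so the two residuals are symmetric around zero at roughly plus and minus 0.048, about half of the log difference each.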

There is clear seasonality in road speed. Speed tends to increase in December and January and to fall between August and September. January is typically China’s holiday season, and early September is when the new academic year starts. The second graph marks the weeks that contain major holidays. These weeks have unusually high road speed relative to the same day of week in other parts of the year.


Figure A.4: Seasonality

[Figure: two panels titled "seasonality" plotting weekly residual log speed (vertical axis, -.1 to .2) by calendar month from 2016-8 to 2018-1. One panel annotates the weeks containing National Day, New Year, Chinese New Year, Qingming, Labor Day, and Duanwu.]

Note: Weekly average residual log speed during rush hours (7AM-9AM, 5PM-7PM) in sample road segments. Log hour speed is first regressed against a full set of road segment-by-day of week-by-hour indicators. Weeks including national holidays are annotated.

Figure A.5 shows the daily average of residual log speed in Chengdu. Chengdu had the most subway openings during the sample period, and two of its opening dates are not at the end of the year. Again, almost all of the days with particularly high residual log speed are holidays, while almost all of the days with particularly low residual log speed are working weekends that fall just before or after a major holiday.1 The second graph marks the dates when new subway lines open. A rough visual inspection suggests that there is no clear change in speed after a subway line opening.

1 Sometimes a working day is switched between a weekday and a weekend day to create consecutive non-working days around a holiday. For example, the 2017 Qingming holiday fell on a Tuesday (April 4). The holiday extended 3 days, from April 2 (Sunday) to April 4, while April 1 (Saturday) was switched to a working day.
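The calendar arithmetic in the Qingming example can be checked with Python's standard library. This is only an illustrative check of the dates quoted in the footnote; the helper `day_name` is hypothetical.

```python
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def day_name(d):
    """Name of the weekday for a date (date.weekday() returns 0 for Monday)."""
    return DAY_NAMES[d.weekday()]


qingming = date(2017, 4, 4)
holiday_days = [qingming - timedelta(days=k) for k in (2, 1, 0)]  # April 2 to April 4
switched_workday = date(2017, 4, 1)

holiday_names = [day_name(d) for d in holiday_days]
workday_name = day_name(switched_workday)
```

The three holiday dates run from a Sunday through a Tuesday, and the switched working day is the Saturday immediately before them, which matches the description of a working weekend adjacent to a major holiday.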


Figure A.5: Chengdu

[Figure: two panels titled "Chengdu" plotting daily residual log speed (vertical axis, -.4 to .4) from 01aug2016 to 23jan2018. One panel marks the Line 4 extensions, the Line 10 opening, and the Line 7 opening; the other marks holidays and workdays on weekends.]

Note: Daily average residual log speed during rush hours (7AM-9AM, 5PM-7PM) in sample road segments in Chengdu. Log hour speed is first regressed against a full set of road segment-by-day of week-by-hour indicators. National holidays are annotated. Dates with new subway line openings are marked.

C Examples of Subway Construction Timelines

Beijing Line 16

• 5/4/2011 — Construction plan was approved by the Beijing municipal government.

• 10/26/2012 — Study on environmental impacts was completed. One station was removed from the plan.

• 4/25/2013 — Designs for the subway stations were on display to the public.

• 12/10/2013 — Second round of evaluation study on the environmental impacts was published. The evaluation was released to the public for comments.


• 12/12/2013 — Construction officially started.

• 8/16/2016 — Scale-test cars were run on completed tracks.

• 9/5/2016 — Tracks were electrified.

• 9/20/2016 – Passenger trains were tested without passengers on board.

• 12/31/2016 — Line opened officially.

Suzhou Line 4

• 6/18/2012 — Evaluation of environmental impacts was approved by the Ministry of Environmental Protection.

• 6/19/2012 — Plans of land use along the subway line were approved by the Ministry of Land and Resources.

• 9/27/2012 — Construction officially started.

• 9/12/2013 — First tunnel boring shield machine was deployed.

• 12/18/2014 — Tracks were laid.

• 8/16/2015 — Tunnels were completed.

• 12/18/2015 — Tracks were completed.

• 4/9/2016 — Tracks were electrified.

• 12/28/2016 — Test rides were run with trains carrying no passengers.

• 3/13/2017 — The subway line passed the final examination by a panel of experts and was accepted as suitable for operation.

• 3/20/2017-3/24/2017 — Free tickets were handed out to residents for test rides.

• 3/25/2017-3/31/2017 — Test rides with passengers.

• 4/15/2017 — Line opened officially.


D Additional Empirical Results

Figure A.6: Dynamic Effects for Subgroups

Subway Lines Opened before Feb 28, 2017

[Figure: four panels plotting coeff. (vertical axis, -.05 to .15) against weeks to subway opening (horizontal axis, -8 to 48). The panels show, cumulatively, weekly, monthly, quarterly, and post average estimates.]

Subway Lines Opened after Feb 28, 2017

[Figure: four panels plotting coeff. (vertical axis, -.05 to .1) against weeks to subway opening (horizontal axis, -36 to 4). The panels show, cumulatively, weekly, monthly, quarterly, and post average estimates.]

Note: The sample in the first set of graphs includes cases of subway lines that opened before February 28, 2017. Observations up to 6 weeks prior to and 48 weeks after the subway line opening are included. The sample in the second set of graphs includes cases of subway lines that opened after February 28, 2017. Observations up to 36 weeks prior to and 4 weeks after the subway line opening are included.


Figure A.7: Adjusting for Few Treated Groups (Ferman and Pinto, 2018)

[Figure: coefficient plot listing each subway line in the sample (for example, Guangzhou 7 (1st phase), Nanjing S3, and Wuhan 6) with its estimated coefficient and bootstrapped 95% C.I. Horizontal axis: coeff., from -.4 to .6.]

Note: 95% confidence intervals suggested by Ferman and Pinto (2018) are included.
