
[IEEE 2012 International Conference on Statistics in Science, Business and Engineering (ICSSBE2012) - Langkawi, Kedah, Malaysia (2012.09.10-2012.09.12)] 2012 International Conference


Exploration of Sample Quantiles and Mean Speed Distributions

Nordiana Mashros, Raha Rahman, Johnnie Ben-Edigbe Faculty of Civil Engineering

Universiti Teknologi Malaysia Skudai, Malaysia

[email protected], [email protected], [email protected]

Abstract— In highway traffic analysis, speed and traffic flow are essential parameters, and the 85th and 15th percentile speeds are commonly used. Comparative assessments of qualitative service often rely on descriptive statistics of speeds. Some studies have focused on the sampling distribution of the mean rather than the sampling distribution of the percentile, while others have applied the binomial test to percentiles. This paper explores two statistical tests on the 85th percentile speed using a study of the impact of rainfall carried out in Terengganu. Hypotheses were set up for a simple percentile (quantile) test and a sample mean test. The normality of the data was evaluated, and the results show that the speed reduction was statistically significant in all cases. The paper concludes that the statistical test for percentiles based on Cramér's theory of the asymptotic distribution of sample quantiles is a better fit when testing the statistical significance of speed changes occasioned by rainfall.

Keywords-quantiles; cumulative speed distribution; 85th percentile speed; 15th percentile speed; mean speed

I. INTRODUCTION

Speed and traffic flow are two parameters that are often used in traffic engineering as indicators of the safe and efficient movement of people and goods on roadways. They are also important elements in comparative assessments of qualitative service in response to the prevailing traffic conditions. In most cases, speed percentiles, also known as quantiles, are the tools used when the effectiveness of highway service or safety is the main concern. The most important speed percentiles are the 85th and 15th percentiles. The 85th percentile is defined as the speed at or below which 85 percent of vehicles are moving, and it is the primary guide in determining the speeds at which the majority of safe and reasonable drivers are travelling. In other words, this percentile is normally used in evaluating or recommending posted speed limits, as it is assumed to be the highest safe speed for a roadway section. The 15th percentile is the speed at or below which 15 percent of vehicles are travelling. This value is useful in determining the allowable speed limit because vehicles travelling below this speed tend to obstruct the flow of traffic and thus increase the potential for accidents. The 85th and 15th percentile speeds require field measurements, and if field measurements are going to be made, the analyst might as well measure the mean speed directly. Mean speed is a measure of the central tendency of the data and is calculated as the sum of all speeds divided by the number of speed observations. It is regarded as an indicator of average travel speed.
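The definitions above can be illustrated with a minimal Python sketch. It assumes the order-statistic convention that appears later in the paper for the quantile test, where the sample quantile is the ([np] + 1)-th smallest observation; other software may interpolate between observations and give slightly different values. The function names and the list of spot speeds are hypothetical and are not data from the study.

```python
import math

def order_statistic_percentile(speeds, p):
    """Return the ([n*p] + 1)-th smallest speed, a simple sample quantile."""
    ordered = sorted(speeds)
    n = len(ordered)
    rank = min(math.floor(n * p) + 1, n)  # 1-based rank, capped at n
    return ordered[rank - 1]

def mean_speed(speeds):
    """Sum of all speeds divided by the number of observations."""
    return sum(speeds) / len(speeds)

# Hypothetical spot speeds in km/hr (illustrative only, not study data)
speeds = [62, 75, 81, 68, 90, 73, 85, 70, 78, 66, 88,
          72, 79, 83, 69, 76, 94, 71, 80, 74, 77]

p85 = order_statistic_percentile(speeds, 0.85)
p15 = order_statistic_percentile(speeds, 0.15)
avg = mean_speed(speeds)
print(f"85th percentile: {p85} km/hr")
print(f"15th percentile: {p15} km/hr")
print(f"Mean speed: {avg:.1f} km/hr")
```

With 21 observations, the 85th percentile is the 18th smallest speed and the 15th percentile is the 4th smallest.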

There are several statistical tools for percentiles; however, some researchers choose to forgo statistical analysis because the procedures for conducting the tests are complex. In some studies, the method applied focuses on averaging percentiles rather than on more suitable methods, while others have applied the binomial test to percentiles. Yet others have described their methodology in detail while avoiding statistical testing altogether, arguing that the existing literature offers scant evidence of sound theory on statistical tests for percentiles. Even if that is true, it raises the question: why should the lack of a simple percentile test be a reason for using binomial proportion or sample mean tests? Based on the null hypothesis that there is no statistically significant difference in percentile speed between two sample populations at the 95% confidence level, this paper explores two statistical tests on the 85th percentile speed using a case study of the impact of rainfall on vehicle speed along a highway segment in Terengganu. The statistical tests chosen for this study are based on: (i) Cramér's theory of the asymptotic distribution of sample quantiles and (ii) averaging percentiles. The remainder of the paper is divided into four sections. Section II reviews the literature and presents some statistical tests for percentile and mean speeds. Section III describes the setup of the impact study and data collection. Empirical results and findings are discussed in Section IV, and the conclusions drawn from the study are presented in Section V.

II. LITERATURE REVIEW

The comparison of two sample populations is very common in analytical work by engineers and scientists. Nonparametric double bootstrapping, quantile regression, the binomial test and averaging percentiles are among the statistical tools widely used to compare the percentile speeds of different populations. The first two are considered sophisticated methods for the comparative assessment of percentiles. The double nonparametric bootstrap procedure [1] is a simulation method based on resampling of existing data. It involves two stages: the first bootstrap produces estimates of the standard errors of the desired percentiles, and the second bootstrap provides the threshold cutoff values for the hypothesis test or confidence interval. This statistical test is beneficial in that it does not require populations to follow specific distributions or to have balanced sample sizes or equal variances [2]. It also allows direct statistical inferences to be drawn on differences between the percentiles of two or more sample populations [3]. A nonparametric double bootstrapping method was used by Brewer et al. [4] on the 85th percentile speed in a work zone speed limit study, while Voigt et al. [5] applied it to the 85th percentile speed to investigate the impact of dual-advisory warning signs on speed reduction on freeway-to-freeway connectors in Texas.

The quantile regression method [6] is a type of regression analysis commonly used in econometrics. It is considered a natural extension of ordinary least squares, from estimating conditional means to estimating conditional quantiles. The method builds a linear model relating the desired quantiles to intervention factors and then estimates the standard error of the desired quantiles through the standard error of the model parameters [2].
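To make the resampling idea concrete, the following Python sketch estimates the standard error of an 85th percentile speed with a single-level bootstrap. This is a deliberately simplified illustration and not the double bootstrap procedure of [1]: it covers only the first stage (standard error estimation) and omits the second stage that produces the test cutoff values. The function names, the number of resamples and the simulated speeds are all assumptions made for the example.

```python
import math
import random

def percentile(values, p):
    """Sample quantile taken as the ([n*p] + 1)-th smallest value."""
    ordered = sorted(values)
    rank = min(math.floor(len(ordered) * p) + 1, len(ordered))
    return ordered[rank - 1]

def bootstrap_se(values, p, n_resamples=500, seed=0):
    """Single-level bootstrap estimate of the standard error of a percentile."""
    rng = random.Random(seed)
    n = len(values)
    estimates = []
    for _ in range(n_resamples):
        # Resample the observed data with replacement, keeping the sample size
        resample = [values[rng.randrange(n)] for _ in range(n)]
        estimates.append(percentile(resample, p))
    centre = sum(estimates) / n_resamples
    variance = sum((e - centre) ** 2 for e in estimates) / (n_resamples - 1)
    return math.sqrt(variance)

# Simulated speeds in km/hr (hypothetical: normal, mean 74, sd 14)
data_rng = random.Random(42)
speeds = [data_rng.gauss(74.0, 14.0) for _ in range(400)]

se_85 = bootstrap_se(speeds, 0.85)
print(f"Bootstrap SE of the 85th percentile: {se_85:.2f} km/hr")
```

Because the procedure only resamples the observed data, it makes no distributional assumption, which is the property the text attributes to bootstrap-based tests.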

Another method used is the binomial test. Pesti and McCoy [7], for example, used this test to assess the statistical significance of differences in 85th percentile speeds when evaluating the long-term effectiveness of speed monitoring displays in work zones on rural interstate highways. The averaging method is often used when analysing percentile speeds. Agent et al. [8] calculated the average change in the 85th percentile speed across several data collection sites in their speed limit study in Kentucky, while Mattox et al. [9] evaluated the effectiveness of speed-activated signs in work zones by averaging the reduction in 85th percentile speed from several speed data sets. Some researchers have carried out a statistical test on the average percentile. For instance, Hildebrand et al. [10] performed a statistical test to provide evidence of a statistically significant change in the average 85th percentile speed across many rural highway temporary work zone sites. The statistical test generally used for comparing percentiles is a parametric test. For instance, Garber and Srinivasan [11] employed the t-statistic in their study to test for a significant reduction in the 85th percentile speed. The t-statistic is given by:

$$t = \frac{\bar{X}_B - \bar{X}_A}{\sqrt{\dfrac{s_B^2}{n_B} + \dfrac{s_A^2}{n_A}}} \qquad (1)$$

where $\bar{X}_B$ and $\bar{X}_A$ are the mean speeds for the before and after periods, $s_B$ and $s_A$ are the standard deviations of speed for the before and after periods, and $n_B$ and $n_A$ are the sample sizes in the before and after periods. If parametric testing is used on non-normal samples with relatively small sample sizes, findings of statistically significant differences between the compared samples could be invalid [12]. In addition, even if the sample data are normally distributed, the test could be inaccurate when the statistic of interest is the 85th percentile speed [13]. The reason is that the 85th percentile speed is not a parameter that defines the normal distribution; therefore, a parametric hypothesis test could not be conducted on it [14].
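As a rough illustration of the t-statistic for two sample means, the sketch below computes it from summary statistics. It assumes that s denotes the sample standard deviation in each period and that the samples are large enough for a standard normal approximation to the t distribution when converting the statistic to a one-sided p-value. The function name and the before and after values are hypothetical.

```python
import math
from statistics import NormalDist

def two_sample_t(mean_b, sd_b, n_b, mean_a, sd_a, n_a):
    """t-statistic for the difference between before and after mean speeds."""
    standard_error = math.sqrt(sd_b ** 2 / n_b + sd_a ** 2 / n_a)
    return (mean_b - mean_a) / standard_error

# Hypothetical before/after summary statistics in km/hr (not study data)
t_stat = two_sample_t(mean_b=80.0, sd_b=12.0, n_b=400,
                      mean_a=77.0, sd_a=11.0, n_a=450)

# Assumption: large samples, so the normal distribution approximates t
p_one_sided = 1.0 - NormalDist().cdf(t_stat)
print(f"t = {t_stat:.3f}, one-sided p = {p_one_sided:.5f}")
```

A positive statistic with a small one-sided p-value would indicate that the mean speed in the after period is significantly lower than in the before period.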

Even though the above statistical methods have been applied in many studies, a recent study identified problems with them [2]. Nonparametric double bootstrapping and quantile regression are fairly complex methods and not easy to apply. The use of the binomial test and of averaging percentiles for analysing percentile values is questionable because they are perhaps not the most appropriate fit. Because there is no statistical test for comparing percentiles that is both easy to apply and theoretically sound, some studies, such as Eckenrode et al. [14], have not pursued statistical analysis.

Recognising these issues, Hou et al. [2] proposed a statistical test for the 85th and 15th percentiles based on Cramér's theory of the asymptotic distribution of sample quantiles. This theory has existed for many years, having been derived by Cramér [15]. Normality of the data is required for the accuracy of the quantile test. The proposed statistical test is similar to the test used by Knoblauch et al. [16] on 15th percentile pedestrian crossing speeds; however, the estimated value of the standard error is somewhat different. The test is fully developed for the 85th percentile speed, with the assumption that the 15th percentile speed has the same form because of the symmetry of the normal distribution. The difference can be compared using the test statistic below once the sample size reaches approximately 200.

$$\frac{\left(X_{([0.85\,n_x]+1)} - Y_{([0.85\,n_y]+1)}\right) - 0}{1.530\,\sqrt{S_x^2/n_x + S_y^2/n_y}} \qquad (2)$$

where $X_{([0.85\,n_x]+1)}$ and $Y_{([0.85\,n_y]+1)}$ are the 85th sample quantiles from independent normal distributions, $n_x$ and $n_y$ are the sample sizes, and $S_x^2$ and $S_y^2$ are the sample variances. The accuracy of this test is compromised if the data are not normally distributed.
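The constant 1.530 in the test statistic can be traced to the standard asymptotic result for a sample quantile. The short derivation below is a sketch under the paper's normality assumption; the symbols $\xi_p$ (population quantile), $f$ (population density), $\sigma$ (population standard deviation), $z_p$ (standard normal quantile) and $\varphi$ (standard normal density) are introduced here for illustration and do not appear in the original text.

```latex
\begin{align*}
\sqrt{n}\,\bigl(X_{([np]+1)} - \xi_p\bigr)
  &\xrightarrow{\;d\;} N\!\left(0,\ \frac{p(1-p)}{f(\xi_p)^2}\right), \\
f(\xi_p) &= \frac{\varphi(z_p)}{\sigma}
  \quad\text{for a normal population}, \\
\operatorname{SE}\bigl(X_{([np]+1)}\bigr)
  &\approx \frac{\sqrt{p(1-p)}}{\varphi(z_p)}\cdot\frac{\sigma}{\sqrt{n}}, \\
p = 0.85:\quad z_p \approx 1.036,\quad \varphi(z_p) \approx 0.233,
  &\quad \frac{\sqrt{0.85 \times 0.15}}{\varphi(z_p)}
  \approx \frac{0.357}{0.233} \approx 1.53 .
\end{align*}
```

Replacing $\sigma^2$ with the sample variance in each group and adding the two variances of the independent quantile estimates gives a denominator of the form $1.53\sqrt{S_x^2/n_x + S_y^2/n_y}$, which agrees with the constant 1.530 to rounding. Because $p(1-p)$ and $\varphi(z_p)$ are unchanged when $p$ is replaced by $1-p$, the same factor applies to the 15th percentile.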

III. SETUP OF THE IMPACT STUDY AND DATA COLLECTION

Two automatic traffic counters were installed on a single-carriageway section of Federal Route 3 in Terengganu, Malaysia, as shown in Figure 1, to obtain traffic data. The counters recorded detailed information on each vehicle traversing the observation point, including speed, direction of travel, headway, gap, wheelbase length, vehicle classification, and time and date, continuously for 12 weeks during the monsoon period. Rainfall data, consisting of rain intensity, time and date, were obtained from the nearest rain gauge station. The rainfall data were matched with the traffic data to separate traffic observed during rainfall spells from traffic observed in dry conditions. Only daylight traffic data are used in this paper.

Figure 1. Setup of impact study (labels recovered from the diagram: Automatic Traffic Counter; Intersection; distances of 120 m and 1 m; "straight, flat terrain, good pavement condition")

IV. EMPIRICAL RESULTS AND FINDINGS

Table 1 presents the descriptive statistics of speeds from Federal Route 3 in Terengganu. Both the mean speed and the 85th percentile speed decrease under rainfall. The standard deviations for the large samples under dry and rainfall conditions are around 13 to 14 km/hr. The t-test was performed to assess the statistical significance of the difference in mean speeds between dry and rainfall conditions. The t-test results for cases 1 and 2, shown in Table 2, suggest that the difference in mean speeds was statistically significant in all cases.

TABLE 1. SUMMARIES OF SPEED DATA

| | Case 1, Dry | Case 1, Rainfall | Case 2, Dry | Case 2, Rainfall |
|---|---|---|---|---|
| Mean speed (km/hr) | 74.6 | 70.4 | 73.5 | 69.7 |
| 85th percentile (km/hr) | 88.3 | 83.9 | 88.2 | 83.9 |
| Standard deviation (km/hr) | 13.8 | 13.2 | 14.4 | 13.8 |
| Sample size (vehicles) | 27086 | 31386 | 23012 | 28857 |

TABLE 2. T-TEST RESULTS FOR SPEED DATA

| | Case 1 | Case 2 |
|---|---|---|
| Hypothesis | H0: µD = µR; H1: µD > µR | H0: µD = µR; H1: µD > µR |
| Mean speed under dry condition (km/hr) | 74.6 | 73.5 |
| Mean speed under rainfall condition (km/hr) | 70.4 | 69.7 |
| Change (km/hr) | -4.2 | -3.8 |
| p-value | 0.000 | 0.000 |
| Reject null hypothesis? | Yes | Yes |

Figures 2 and 3 illustrate the quantile-quantile (Q-Q) goodness-of-fit plots under dry and rainfall conditions for cases 1 and 2, respectively. The Q-Q plots show that the speed distributions in all cases have normal-like curves. The quantile test suggested by Hou et al. [2] was applied to the 85th percentile speed, and the results are shown in Table 3. The results indicate that the differences in 85th percentile speed were statistically significant in all cases. Therefore, the 85th percentile speed under dry conditions is statistically higher than that under rainfall.
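The quantile test can be reproduced approximately from the summary statistics in Table 1. The sketch below assumes that the standard deviations reported in Table 1 are the sample standard deviations entering the test statistic, and that the statistic is referred to a standard normal distribution for a one-sided test; the function and variable names are illustrative and not taken from the study.

```python
import math
from statistics import NormalDist

QUANTILE_SE_FACTOR = 1.530  # constant from the quantile test statistic, Eq. (2)

def quantile_z(q_x, sd_x, n_x, q_y, sd_y, n_y):
    """Test statistic for the difference between two 85th percentile speeds."""
    standard_error = QUANTILE_SE_FACTOR * math.sqrt(
        sd_x ** 2 / n_x + sd_y ** 2 / n_y
    )
    return (q_x - q_y) / standard_error

# Summary statistics from Table 1 (speeds in km/hr)
cases = {
    1: {"q_dry": 88.3, "sd_dry": 13.8, "n_dry": 27086,
        "q_rain": 83.9, "sd_rain": 13.2, "n_rain": 31386},
    2: {"q_dry": 88.2, "sd_dry": 14.4, "n_dry": 23012,
        "q_rain": 83.9, "sd_rain": 13.8, "n_rain": 28857},
}

results = {}
for case, c in cases.items():
    z = quantile_z(c["q_dry"], c["sd_dry"], c["n_dry"],
                   c["q_rain"], c["sd_rain"], c["n_rain"])
    # Assumption: one-sided p-value from the standard normal distribution
    p_one_sided = 1.0 - NormalDist().cdf(z)
    results[case] = (z, p_one_sided)
    print(f"Case {case}: z = {z:.2f}, one-sided p = {p_one_sided:.3f}")
```

Under these assumptions the statistics come out at roughly 25.6 for case 1 and 22.5 for case 2, so the one-sided p-values are far below 0.001, which is consistent with the p-values of 0.000 reported in Table 3.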

Figure 2. Cumulative speed distribution for dry and rainfall conditions for case 1

Figure 3. Cumulative speed distribution for dry and rainfall conditions for case 2

TABLE 3. RESULT OF QUANTILE TEST ON 85TH PERCENTILE SPEED

| | Case 1 | Case 2 |
|---|---|---|
| Hypothesis | H0: P0.85D = P0.85R; H1: P0.85D > P0.85R | H0: P0.85D = P0.85R; H1: P0.85D > P0.85R |
| 85th percentile speed under dry condition (km/hr) | 88.3 | 88.2 |
| 85th percentile speed under rainfall condition (km/hr) | 83.9 | 83.9 |
| Change (km/hr) | -4.4 | -4.3 |
| p-value | 0.000 | 0.000 |
| Reject null hypothesis? | Yes | Yes |

The summary of average 85th percentile speeds is presented in Table 4. The results show that the average 85th percentile speed fell by 4.4 km/hr from dry conditions to rainfall. A t-test was performed on these speed data, and the result is given in Table 5. The decrease in average 85th percentile speed was found to be statistically significant.

TABLE 4. AVERAGE 85TH PERCENTILE SPEED

| | 85th percentile speed under dry condition (km/hr) | 85th percentile speed under rainfall condition (km/hr) |
|---|---|---|
| Case 1 | 88.3 | 83.9 |
| Case 2 | 88.2 | 83.9 |
| Average | 88.3 | 83.9 |

TABLE 5. T-TEST RESULT FOR AVERAGE 85TH PERCENTILE SPEED

| | Result |
|---|---|
| Hypothesis | H0: P0.85D = P0.85R; H1: P0.85D > P0.85R |
| Average 85th percentile speed under dry condition (km/hr) | 88.3 |
| Average 85th percentile speed under rainfall condition (km/hr) | 83.9 |
| Change (km/hr) | -4.4 |
| p-value | 0.000 |
| Reject null hypothesis? | Yes |

V. CONCLUSIONS

Comparative assessments of the 85th percentile speed between dry and rainfall conditions have been presented in this paper. The normally distributed sample data allowed the quantile test and the t-test to be used to test the significance of differences between sample populations. The quantile test, which is developed from Cramér's derivation of the asymptotic distribution of sample quantiles, was used to compare the 85th percentile speed for each case, while the t-test was used to compare the average 85th percentile speed. The differences in 85th percentile speed between dry and rainfall conditions were found to be statistically significant in all cases. Both statistical tests are simple to use; however, the accuracy of the t-test is open to question even though it provides evidence of a statistically significant change in average percentile between sample populations. It can therefore be concluded that the statistical test for percentiles based on Cramér's theory of the asymptotic distribution of sample quantiles is a better fit when testing the statistical significance of speed changes occasioned by rainfall.

REFERENCES

[1] C. Spiegelman and T.J. Gates, “Post hoc quantile test for one-way ANOVA using a double bootstrap method,” In Transportation Research Record: Journal of the Transportation Research Board, No. 1908, Transportation Research Board of the National Academies, Washington, D.C., pp. 19–25, 2005.

[2] Y. Hou, C. Sun, and P. Edara, “A statistical test for 85th and 15th percentile speeds using the asymptotic distribution of sample quantiles,” in: The 91st Annual Meeting of the Transportation Research Board of the National Academies, Washington, D.C., January 2012.

[3] T.J. Gates, H.G. Hawkins, Jr., S. T. Chrysler, P.J. Carlson, A.J. Holick, and C.H. Spiegelman, Traffic Operational Impacts of Higher-Conspicuity Sign Materials, FHWA/TX-04/4271-1, 2003.

[4] M.A. Brewer, G. Pesti, and W. Schneider, “Improving compliance with work zone speed limits: effectiveness of selected devices,” In Transportation Research Record: Journal of the Transportation Research Board, No. 1948, Transportation Research Board of the National Academies, Washington, D.C., pp. 67–76, 2006.

[5] A.P. Voigt, C.R. Stevens, and D.W. Borchardt, “Dual-advisory speed signing on freeway-to-freeway connectors in Texas,” In Transportation Research Record: Journal of the Transportation Research Board, No. 2056, Transportation Research Board of the National Academies, Washington, D.C., pp. 87-94, 2008.

[6] P.J. Hewson, “Quantile regression provides a fuller analysis of speed data,” Accident Analysis and Prevention, Vol. 40, No. 2, pp. 502-510, 2008.

[7] G. Pesti and P.T. McCoy, “Long-term effectiveness of speed monitoring displays in work zones on rural interstate highways,” Paper No. 01-2789, Transportation Research Board, 80th Annual Meeting, Washington, D.C., January 7-11, 2001.

[8] K.R. Agent, J.G. Pigman, and J.M. Webber, “Evaluation of speed limits in Kentucky,” In Transportation Research Record: Journal of the Transportation Research Board, No. 1640, Transportation Research Board of the National Academies, Washington, D.C., pp. 57-64, 1998.

[9] J.H. Mattox, W.A. Sarasua, J.H. Ogle, R.T. Eckenrode, and A. Dunning, “Development and evaluation of speed-activated sign to reduce speeds in work zones,” In Transportation Research Record: Journal of the Transportation Research Board, No. 2015, Transportation Research Board of the National Academies, Washington, D.C., pp. 3-11, 2007.

[10] E.D. Hildebrand, F.R. Wilson, and J.J. Copeland, “Speed management strategies for rural temporary work zones,” In Proceedings of Canadian Multidisciplinary Road Safety Conference XIII, Banff, Alberta, June 8-11, 2003.

[11] N.J. Garber, and S. Srinivasan. Effectiveness of Changeable Message Signs in Work Zones: Phase II. Report No. VTRC 98-R 10. Virginia: Virginia Transportation Research Council, 1998.

[12] J.L. DeVore, Probability and Statistics for Engineering and the Sciences, 5th Edition, Duxbury, Belmont, California, 2000.

[13] A.P. Voigt, P.E., C.R. Stevens, Jr., and D.W. Borchardt, P.E., Guidelines for Dual-Advisory Speed Signing on Freeway-to-Freeway Connectors in Texas, Technical Report No. FHWA/TX-07/0-4813-1, 2007.

[14] R.T. Eckenrode, W.A. Sarasua, J.H. Mattox, J.H. Ogle, and M. Chowdhury, “Revisiting the use of drone radar to reduce speed in work zones: South Carolina’s experience,” In Transportation Research Record: Journal of the Transportation Research Board, No. 2015, Transportation Research Board of the National Academies, Washington, D.C., pp. 19-27, 2007.

[15] H. Cramér, Mathematical Methods of Statistics. Princeton University Press, 1946.

[16] R.L. Knoblauch, M.T. Pietrucha, and M. Nitzberg, “Field studies of pedestrian walking speed and start-up time,” In Transportation Research Record: Journal of the Transportation Research Board, No. 1538, Transportation Research Board of the National Academies, Washington, D.C., pp. 27–38, 2006.