Page 1: Analysis of Real-World Data

Analysis of Real-World Data

Static Stability Factor and the Risk of Rollover

April 11, 2001

Page 2: Analysis of Real-World Data

References

• Federal Register, June 1, 2000
  – Description of the original linear regression analysis

• Federal Register, January 12, 2001
  – Description of the updated linear regression analysis
  – Comparison with logistic regression analysis

Page 3: Analysis of Real-World Data

Need to Specify

• Vehicles

• Calendar years

• States

• Crash types

• Variables

• Statistical model

Page 4: Analysis of Real-World Data

Criteria for Selecting Vehicles

• Reliable estimate of the Static Stability Factor (SSF)

• Model years 1988 and later

• Sources include:
  – Vehicles tested by the agency
  – Passenger cars tested by General Motors

Page 5: Analysis of Real-World Data

Vehicles Selected

• 100 vehicle model groups, including:
  – 36 cars
  – 30 SUVs
  – 13 vans
  – 21 pickup trucks

Page 6: Analysis of Real-World Data

Criteria for Selecting Calendar Years

• Vehicle Identification Numbers (VINs) for that year had been decoded and included in the State Data System (SDS)

• Wanted multiple years to maximize data available for analysis

Page 7: Analysis of Real-World Data

Calendar Years Selected

• 1994-1997 for the original linear regression analysis

• 1994-1998 for the updated linear regression analysis and the logistic regression analysis

Page 8: Analysis of Real-World Data

Criteria for Selecting States

• Part of the SDS

• Provided 1994-1998 calendar year data

• Include VIN on the crash file

• Identify rollover occurrence even if it is not the first harmful event in the crash

Page 9: Analysis of Real-World Data

States Selected

• Florida

• Maryland

• Missouri

• North Carolina

• Pennsylvania

• Utah

Page 10: Analysis of Real-World Data

Other SDS VIN States

• VIN available for fatalities only
  – Kansas

• VIN added in 1998
  – Georgia

• Incomplete rollover information
  – New Mexico
  – Ohio

Page 11: Analysis of Real-World Data

Criteria for Selecting Crashes

• Single-vehicle crashes of study vehicles

• Excluded crashes with other participants
  – Pedestrian, pedalcyclist, animal, or train

• Excluded certain unusual situations
  – No driver, parked vehicle, pulling a trailer, or emergency use (ambulance, fire, police, or military)

Page 12: Analysis of Real-World Data

Crashes Selected

• 241,036 single-vehicle crashes, including 48,996 rollovers

• This is 0.20 rollovers per single-vehicle crash, consistent with the national estimate from the General Estimates System for these calendar years and vehicle groups
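
The 0.20 figure follows directly from the two counts on this slide. A minimal arithmetic check (variable names are illustrative only):

```python
# Counts reported on the "Crashes Selected" slide.
single_vehicle_crashes = 241_036
rollovers = 48_996

# Rollovers per single-vehicle crash, the rate quoted on the slide.
rollover_rate = rollovers / single_vehicle_crashes
print(f"{rollover_rate:.2f}")
```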

Page 13: Analysis of Real-World Data

Criteria for Selecting Variables

• Variables describing purpose of study
  – Rollover (yes or no)
  – SSF (study values range from 1.00 to 1.53)

• Confounding factors
  – Environmental and driver factors that describe how the vehicle was used
  – Want variables correlated with rollover risk, including travel speed

Page 14: Analysis of Real-World Data

Variables Selected

• Rollover

• SSF

• Dichotomous variables based on:
  – Environmental factors (light condition, weather, urbanization, speed limit, road grade, road curve, road condition, surface condition)
  – Driver factors (sex, age, insurance coverage, alcohol/drug use)

• Number of occupants in the vehicle

Page 15: Analysis of Real-World Data

Summary of Available Data

• Six states

• Five calendar years (1994-1998)

• 100 vehicle groups with a reliable estimate of SSF

• 14 confounding variables, including 10 available in all six states

• 241,036 single-vehicle crashes, including 48,996 rollovers

Page 16: Analysis of Real-World Data

Limitations

• Pennsylvania dropped key road use variables (grade and curve) from its electronic file in 1998, so 1998 Pennsylvania data were not used here

• Some variables were not available for all six states (urbanization, road condition, insurance coverage, and number of occupants in vehicle)
  – Could not be used in analysis of combined data
  – Were used in logistic analysis of individual states

• Reporting practices vary by state

Page 17: Analysis of Real-World Data

Statistical Models

• Linear model of summarized data

• Logistic models of individual crashes

Page 18: Analysis of Real-World Data

Preparing Data for the Linear Model

• Limited to state-vehicle groups with at least 25 observations
  – 518 state-vehicle groups used in analysis

• Percentage involvement calculated for each variable, for each state-vehicle group
  – Values ranged from 0 to 1
  – For example:
    • Rollover risk described by rollovers per single-vehicle crash
    • Urbanization described by percent of crashes on rural roads
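
The summarization step described above might be sketched as follows. This is only an illustration under stated assumptions: the record keys (`state`, `vehicle_group`) and the 0/1 coding of each variable are hypothetical, and only the 25-observation threshold comes from the slide.

```python
from collections import defaultdict

MIN_OBSERVATIONS = 25  # threshold stated on the slide


def summarize_groups(crashes, variables, min_obs=MIN_OBSERVATIONS):
    """Return {(state, vehicle_group): {"n": count, var: proportion, ...}}.

    `crashes` is an iterable of dicts with hypothetical keys "state",
    "vehicle_group", and one 0/1 entry for each name in `variables`.
    """
    counts = defaultdict(int)
    sums = defaultdict(lambda: defaultdict(int))
    for crash in crashes:
        key = (crash["state"], crash["vehicle_group"])
        counts[key] += 1
        for var in variables:
            sums[key][var] += crash[var]

    summary = {}
    for key, n in counts.items():
        if n < min_obs:
            continue  # drop groups with fewer than min_obs crashes
        row = {"n": n}
        for var in variables:
            row[var] = sums[key][var] / n  # proportion between 0 and 1
        summary[key] = row
    return summary
```

Each retained state-vehicle group then contributes one summary data point, with every variable expressed as a proportion of that group's single-vehicle crashes.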

Page 19: Analysis of Real-World Data

Specifying Linear Model Form

• Dependent variable = LOG(rollover risk)
  – Rollover risk set at 0.0001 for state-vehicle groups with no rollovers so they can be included in model

• Five dummy variables used to capture state-to-state differences in reporting practices
  – Missouri used as baseline case

• Linear regression of the rollover variable as a function of the summarized explanatory variables and the state dummy variables
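
A rough sketch of how the dependent variable and the state dummies could be built for one state-vehicle group. The slide says only LOG, so the natural logarithm here is an assumption, and the function names are hypothetical; the 0.0001 floor, the six states, and the Missouri baseline come from the slides.

```python
import math

STATES = ["Florida", "Maryland", "Missouri",
          "North Carolina", "Pennsylvania", "Utah"]
BASELINE_STATE = "Missouri"  # baseline case named on the slide
RISK_FLOOR = 0.0001          # value used for groups with no rollovers


def log_rollover_risk(rollovers, crashes, floor=RISK_FLOOR):
    """Log of rollovers per single-vehicle crash (natural log assumed)."""
    risk = rollovers / crashes
    if risk == 0:
        risk = floor  # log(0) is undefined, so substitute the floor
    return math.log(risk)


def state_dummies(state):
    """Five 0/1 indicators, one per non-baseline state, in STATES order."""
    if state not in STATES:
        raise ValueError(f"unknown state: {state}")
    return [1 if state == s else 0 for s in STATES if s != BASELINE_STATE]
```

A Missouri group gets all five indicators set to zero, so each fitted dummy coefficient measures a state's offset from Missouri on the log scale.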

Page 20: Analysis of Real-World Data

Fitting the Linear Model

• Each summary data point was weighted by the sample size, capped at 250 as a trade-off between two considerations
  – Sample size affects reliability of estimates
  – Model should fit over entire range of SSF

• Stepwise procedure used forward variable selection and a significance level of 0.15 for entry and removal from the model
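
To illustrate the weighting rule, the sketch below fits a weighted least-squares line with a single predictor, using capped sample sizes as weights. It is a deliberate simplification: the model in the slides has many explanatory variables and a stepwise selection step, neither of which is reproduced here, and the function names are hypothetical.

```python
WEIGHT_CAP = 250  # cap stated on the slide


def capped_weight(n, cap=WEIGHT_CAP):
    """Weight for one summary data point: its sample size, capped."""
    return min(n, cap)


def weighted_line_fit(x, y, n):
    """Weighted least squares for y = intercept + slope * x.

    x, y: values for each summary data point (e.g. SSF and log risk).
    n:    sample size behind each point; weights are capped at WEIGHT_CAP.
    """
    w = [capped_weight(k) for k in n]
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope
```

With the cap, a group of 2,000 crashes counts no more than a group of 250, which keeps a few high-volume vehicle groups from dominating the fit across the SSF range.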

Page 21: Analysis of Real-World Data

Results of the Linear Model

• Model selected six confounding factors (DARK, FAST, CURVE, MALE, YOUNG, and DRINK) and all five state dummies

• R² = 0.88 for the model of rollover risk as a function of state, road use variables, and SSF

• SSF variable coefficient was:
  – Important in terms of the size of the estimated effect
  – Highly significant in the model (P<0.0001)

Page 22: Analysis of Real-World Data

Predictions from the Linear Model

• Model describes rollover risk as a function of the explanatory variables and can be used to:
  – Estimate rollover risk as a function of the SSF for any mix of road-use conditions
  – Adjust the observed rollover rate for each summary data point to account for differences in vehicle use

• Next graph shows results for average conditions observed in the study data as a whole

• Rollover risk is estimated as 0.20 in both the adjusted and the unadjusted data

Page 23: Analysis of Real-World Data

Fit of Linear Model

[Figure: rollovers per single-vehicle crash (y-axis, 0.00 to 0.60) plotted against Static Stability Factor (x-axis, 0.90 to 1.60)]

Page 24: Analysis of Real-World Data

Interpreting the Linear Model

• Estimated rollover risk given a single-vehicle crash is halved when the SSF increases by 0.21

• For example, a vehicle with an SSF of 1.00 has twice the estimated rollover risk of a vehicle with an SSF of 1.21
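
Because the model is linear in the log of rollover risk, the halving rule applies anywhere on the SSF scale, with other conditions held fixed. A small sketch of that relationship (the function name is hypothetical; the 0.21 step is from the slide):

```python
HALVING_STEP = 0.21  # SSF increase that halves the estimated risk


def relative_risk(ssf, reference_ssf):
    """Estimated rollover risk at `ssf` relative to `reference_ssf`."""
    return 0.5 ** ((ssf - reference_ssf) / HALVING_STEP)


# Going from SSF 1.00 to 1.21 halves the estimated risk.
print(round(relative_risk(1.21, 1.00), 3))
```

On a natural-log scale this rule corresponds to an SSF slope of ln(0.5) / 0.21, or about -3.30. That figure is derived here from the halving statement and is not quoted from the slides.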

Page 25: Analysis of Real-World Data

Specifying Logistic Model Forms

• Variables used
  – Individual explanatory variables or
  – Scenario risk variable

• Approach used with states
  – Model each state, and average the results or
  – Model pooled data with dummy variables to capture state-to-state reporting differences

Page 26: Analysis of Real-World Data

Concept of Scenario Risk

• Data divided into cells defined by explanatory variables

• For each cell, scenario risk is rollovers per single-vehicle crash

• For each crash, scenario risk is adjusted to reflect rollovers per single-vehicle crash for all other crashes in the cell

• Idea is to use scenario risk in the logistic model in place of all the explanatory variables
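
One possible reading of the scenario-risk adjustment is a leave-one-out rate: each crash is assigned the rollover rate of the other crashes in its cell, so its own outcome does not feed its own predictor. The sketch below follows that reading; the record layout and the handling of single-crash cells are assumptions, not details given in the slides.

```python
from collections import defaultdict


def scenario_risks(crashes, cell_vars):
    """Leave-one-out rollover rate for each crash within its cell.

    `crashes` is a list of dicts holding a 0/1 "rollover" entry plus the
    explanatory variables named in `cell_vars` (hypothetical layout).
    Returns a list parallel to `crashes`; a crash that is alone in its
    cell gets None because no other crashes are available.
    """
    totals = defaultdict(lambda: [0, 0])  # cell -> [crashes, rollovers]
    for crash in crashes:
        cell = tuple(crash[v] for v in cell_vars)
        totals[cell][0] += 1
        totals[cell][1] += crash["rollover"]

    risks = []
    for crash in crashes:
        n, rolled = totals[tuple(crash[v] for v in cell_vars)]
        if n == 1:
            risks.append(None)
        else:
            # Exclude this crash from both the numerator and denominator.
            risks.append((rolled - crash["rollover"]) / (n - 1))
    return risks
```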

Page 27: Analysis of Real-World Data

Fitting the Logistic Models

• Models from individual states were based on the explanatory variables available in that state

• Models from pooled data were limited to the explanatory variables available in all six states

Page 28: Analysis of Real-World Data

Results of the Logistic Models

• The models from the six individual states and the two models based on pooled data all fit the data well

• These models were consistent in showing a large and significant effect for SSF

Page 29: Analysis of Real-World Data

Predictions from the Logistic Models

• Logistic models describe the change in the log(odds) of rollover as a function of the change in the SSF

• Results can be used to predict the absolute rollover risk as a function of the SSF for a given set of conditions

• Here, estimates of average SSF and odds of rollover are based on the data as a whole

• The four summary models produce similar results
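
Turning a change in log(odds) into an absolute risk can be sketched as below: start from a reference point (for example, the average SSF and the rollover risk observed in the data as a whole), shift the log(odds) by the SSF coefficient times the change in SSF, and convert back to a probability. The function and parameter names are hypothetical, and no fitted coefficient from the study is assumed.

```python
import math


def rollover_probability(ssf, ssf_coefficient, reference_ssf, reference_risk):
    """Predicted rollover risk at `ssf` from a logistic SSF effect.

    ssf_coefficient: change in log(odds) per unit of SSF (caller-supplied;
                     the slides do not give a value).
    reference_ssf, reference_risk: a known point on the curve.
    """
    reference_log_odds = math.log(reference_risk / (1 - reference_risk))
    log_odds = reference_log_odds + ssf_coefficient * (ssf - reference_ssf)
    return 1 / (1 + math.exp(-log_odds))
```

Unlike the log-linear model, this form keeps every prediction strictly between 0 and 1, however far the SSF moves from the reference point.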

Page 30: Analysis of Real-World Data

Comparison of Linear and Logistic Models

• Linear and logistic models both suggest SSF has a large effect on rollover risk

• Next graph compares results of linear model with results of logistic model from pooled data with individual explanatory variables

Page 31: Analysis of Real-World Data

Predictions from the Models

[Figure: rollovers per single-vehicle crash (y-axis, 0.00 to 0.50) plotted against Static Stability Factor (x-axis, 0.90 to 1.60), with one curve for the linear model and one for the logistic model]

Page 32: Analysis of Real-World Data

Conclusions

• Advantages of linear model of summary data
  – All summary data can be shown
  – Simpler to explain

• Advantages of logistic analysis
  – Includes full range of values and interactions because not restricted to averages for each vehicle group
  – Better for measuring effects of explanatory variables because most were significant in the models

• In this analysis, logistic analysis appeared to confirm the general pattern of the linear results