Utilizing support vector machine in real-time crash risk evaluation
Abbreviated Journal Title
Accid. Anal. Prev.
Support vector machine model; Bayesian logistic regression; Real-time crash risk evaluation; Mountainous freeway safety; VARIABLE-SPEED LIMITS; LOGISTIC-REGRESSION; INJURY SEVERITY; TRAFFIC; SAFETY; MODELS; FREEWAYS; FREQUENCY; NETWORKS; INCIDENT; Ergonomics; Public, Environmental & Occupational Health; Social Sciences, Interdisciplinary; Transportation
Real-time crash risk evaluation models will likely play a key role in Active Traffic Management (ATM). Models have been developed to predict crash occurrence in order to proactively improve traffic safety. Previous real-time crash risk evaluation studies mainly employed logistic regression and neural network models, which suffer from a linear functional form and from over-fitting, respectively. Moreover, these studies mostly focused on estimating the models and barely investigated their predictive abilities. In this study, the support vector machine (SVM), a recently proposed statistical learning model, was introduced to evaluate real-time crash risk. The data were split into a training dataset (used for developing the models) and scoring datasets (used for assessing the models' predictive power). A classification and regression tree (CART) model was developed to select the most important explanatory variables, and based on the results, three candidate Bayesian logistic regression models were estimated, each accounting for a different level of unobserved heterogeneity. SVM models with different kernel functions were then developed and compared to the Bayesian logistic regression model. Model comparisons based on the area under the ROC curve (AUC) demonstrated that the SVM model with the radial-basis kernel function outperformed the others. Moreover, several extension analyses were conducted to evaluate the effect of sample size on the SVM models' predictive capability, the importance of variable selection before developing SVM models, and the effect of the explanatory variables in the SVM models. Results indicate that (1) a smaller sample size would enhance the SVM model's classification accuracy, (2) a variable selection procedure is needed prior to SVM model estimation, and (3) explanatory variables have identical effects on crash occurrence for the SVM models and the logistic regression models. (C) 2012 Elsevier Ltd. All rights reserved.
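The AUC-based model comparison mentioned in the abstract can be illustrated with a minimal sketch. It computes AUC through the rank-sum identity (the probability that a randomly chosen crash case receives a higher score than a randomly chosen non-crash case, with ties counted as one half). The function name `auc`, the labels, and both score lists are hypothetical and invented for illustration only; they are not data or results from the study, and the study's own software and procedure are not stated in this record.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    labels: 1 for a crash case, 0 for a non-crash case.
    scores: model outputs where a higher value means higher predicted risk.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one crash and one non-crash case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # crash case ranked above non-crash case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical scoring dataset: 3 crash cases and 5 non-crash cases.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical risk scores from two candidate models (not from the paper).
model_a = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
model_b = [0.8, 0.5, 0.3, 0.6, 0.4, 0.3, 0.2, 0.1]

auc_a = auc(labels, model_a)
auc_b = auc(labels, model_b)
better = "model_a" if auc_a > auc_b else "model_b"
```

Under this criterion the model with the larger AUC on the scoring data is preferred. The pairwise loop is quadratic in the number of cases, which is acceptable for a small illustration but would normally be replaced by a sort-based computation on larger datasets.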
Accident Analysis and Prevention
"Utilizing support vector machine in real-time crash risk evaluation" (2013). Faculty Bibliography 2010s. 4900.