Vector Autoregressive Models, Traffic Prediction, ITS, loop detectors, loop data, time series


Traffic data prediction is a critical component of Advanced Traffic Management Systems (ATMS). Traffic data are useful because they describe how the traffic process evolves, and this information can be passed on to various users (commuters, Regional Traffic Management Centers (RTMCs), Departments of Transportation (DoTs), etc.) for user-specific objectives. This information can be extracted from the data collected by various traffic sensors. Loop detectors collect traffic data in the form of flow, occupancy, and speed throughout the nation. Freeway traffic data from I-4 loop detectors have been collected and stored in a data warehouse called the Central Florida Data Warehouse (CFDW™) by the University of Central Florida for the periods 1993-1994 and 2000-2003. These data are raw, time-stamped 30-second aggregates collected from about 69 stations over a 36-mile stretch of I-4 from Lake Mary in the east to Disney World in the west. They have to be processed to extract information that can be disseminated to various users. Most statistical procedures assume that each data point in the sample is independent of the others. This is not true of traffic data, which are correlated across space and time. The time ordering of the observations and the spatial layout of the data collection devices therefore introduce autocorrelations within a single variable and cross-correlations across multiple variables. Significant autocorrelations indicate that past values of a variable can be used to predict future values of the same variable. Likewise, significant cross-correlations between variables indicate that past values of one variable can be used to predict future values of another variable. The traditional techniques in traffic prediction use univariate time series models that account for autocorrelations but not cross-correlations. 
These models have neglected the cross-correlations between variables that are present in freeway traffic data because of the way the data are collected. There is a need for statistical techniques that incorporate the effect of these multivariate cross-correlations to predict future values of traffic data. The emphasis in this dissertation is on the multivariate prediction of traffic variables. Unlike traditional statistical techniques that have relied on univariate models, this dissertation explored the cross-correlations among multiple traffic variables and among variables collected at adjoining spatial locations (such as loop detector stations). The analysis showed that there were significant cross-correlations among different traffic variables collected at closely spaced locations and at different time scales. The nature of the cross-correlations showed that there was feedback among the variables, and therefore past values can be used to predict future values. Multivariate time series analysis is appropriate for modeling the effect of different variables on each other. In the past, upstream data have been incorporated in time series analysis; however, those models did not account for feedback effects. Vector Autoregressive (VAR) models are more appropriate for such data. Although VAR models have been applied to forecast economic time series, they have not been used to model freeway data. VAR models were estimated for speeds and volumes at a sample of two locations, using 5-minute data. Different specifications were fit: estimation of speeds from surrounding speeds; estimation of volumes from surrounding volumes; estimation of speeds from volumes and occupancies at the same location; and estimation of speeds from volumes at surrounding locations (and vice versa). 
These specifications were compared to univariate models for the respective variables at three levels of data aggregation (5 minutes, 10 minutes, and 15 minutes). At aggregation levels of less than 15 minutes, the VAR models outperformed the univariate models. At the 15-minute aggregation level, the VAR models did not outperform the univariate models. Because VAR models were used for all traffic variables reported by the loop detectors, the application of VAR was a true multivariate procedure for dynamic prediction of the multivariate traffic variables: flow, speed, and occupancy. VAR models are generally deemed more complex than univariate models because multiple covariance matrices must be estimated. However, a VAR model for k variables must be compared to k univariate models, and VAR models compare well with AutoRegressive Integrated Moving Average (ARIMA) models. The added complexity helps model the effect of upstream and downstream variables on the future values of the response variable. This could be useful in ATMS situations where the effect of traffic redistribution and redirection is not known to the prediction models beforehand. The VAR models were tested against more traditional models, and their performance was compared under different traffic conditions. These models significantly enhance the understanding of freeway traffic processes and phenomena and identify potential knowledge relating to traffic prediction. Further refinements of the models can improve forecasts under multiple conditions.
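
To make the comparison between VAR and univariate models concrete, the following is a minimal sketch, not the dissertation's actual procedure or data. It fits a first-order VAR, y_t = c + A y_{t-1} + e_t, by ordinary least squares and compares its one-step forecast error with that of separate univariate AR(1) models. Everything in it is an assumption made for illustration: the synthetic two-variable series, the coefficient values, the lag order of one, and the helper names aggregate, fit_var1, one_step_mse, and simulate_var1.

```python
import numpy as np

def aggregate(series, factor):
    """Average consecutive blocks of `factor` samples along the time axis."""
    n = (len(series) // factor) * factor          # drop an incomplete final block
    blocks = series[:n].reshape(-1, factor, *series.shape[1:])
    return blocks.mean(axis=1)

def fit_var1(y):
    """OLS fit of y_t = c + A y_{t-1} + e_t for y of shape (T, k)."""
    X = np.hstack([np.ones((len(y) - 1, 1)), y[:-1]])   # intercept + lagged values
    coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)    # coef has shape (k + 1, k)
    return coef[0], coef[1:].T                          # c: (k,), A: (k, k)

def one_step_mse(c, A, y):
    """Mean squared one-step-ahead forecast error, per variable."""
    pred = c + y[:-1] @ A.T
    return ((y[1:] - pred) ** 2).mean(axis=0)

def simulate_var1(c, A, T, rng):
    """Generate a synthetic VAR(1) series with unit-variance Gaussian noise."""
    y = np.zeros((T, len(c)))
    for t in range(1, T):
        y[t] = c + A @ y[t - 1] + rng.standard_normal(len(c))
    return y

# Synthetic stand-in for two traffic variables with feedback between them
# (nonzero off-diagonal terms in A). These values are illustrative only.
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.3],
                   [0.3, 0.5]])
c_true = np.array([1.0, 2.0])
y = simulate_var1(c_true, A_true, 12000, rng)
train, test = y[:8000], y[8000:]

# Multivariate model: each variable is predicted from the lags of both.
c_hat, A_hat = fit_var1(train)
mse_var = one_step_mse(c_hat, A_hat, test)

# Univariate baseline: each variable is predicted from its own lag only.
mse_ar = np.empty(2)
for j in range(2):
    c_j, a_j = fit_var1(train[:, [j]])
    mse_ar[j] = one_step_mse(c_j, a_j, test[:, [j]])[0]

print("VAR(1) one-step MSE:", mse_var)
print("AR(1)  one-step MSE:", mse_ar)
```

On series with genuine feedback, the off-diagonal entries of the estimated A carry the cross-variable information that the univariate fits cannot use. The aggregate helper shows one way 30-second samples could be averaged to a coarser interval, for example aggregate(raw, 10) for 5-minute values; averaging suits speed and occupancy, whereas vehicle counts would typically be summed instead.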






Al-Deek, Haitham


Doctor of Philosophy (Ph.D.)


College of Engineering and Computer Science


Civil and Environmental Engineering

Degree Program

Civil Engineering








Release Date

May 2009



Access Status

Doctoral Dissertation (Open Access)