Added value of time series models
In the SafetyNet project, much of the road traffic data collected consists of repeated measurements over time. Examples are a country's annual or monthly numbers of road traffic accidents, road traffic fatalities and vehicle kilometres driven, and its annual or monthly values on safety performance indicators, all measured repeatedly over a certain period of time.
Generally, time series analysis can serve three purposes. First, it can be used to obtain an adequate description of the time series at hand. Second, explanatory variables other than time can be added to the model in order to explain the development of the time series at hand. In SafetyNet, these explanatory variables are national exposure data, national safety performance indicators, and national road traffic safety measures. Third, time series analysis can be used to predict or forecast further developments of a series into the (unknown) future. In traffic safety research, such forecasts can be used, for example, to assess whether future national safety targets are likely to be met.
Whenever one is interested in studying and analysing such developments of one and the same phenomenon over time, special issues arise that are not encountered in cross-sectional data analysis. In these cases, statistical tests based on standard techniques like classical linear regression easily result in overoptimistic or even plainly incorrect conclusions, because the residuals obtained with these techniques usually do not satisfy the important model assumption of independence over time. This is true irrespective of whether the interest lies in descriptive analysis, in explanatory analysis, or in forecasting.
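The dependence problem described above can be made concrete with a small simulation. The following is a minimal sketch, not taken from the SafetyNet project: it generates a purely synthetic series with a linear trend and serially correlated disturbances, fits a classical linear regression on time, and then inspects the residuals. The function names, the series length, the trend coefficients and the dependence parameter `phi` are all illustrative assumptions. Two common diagnostics are computed: the lag-1 autocorrelation of the residuals (near 0 for independent residuals) and the Durbin-Watson statistic (near 2 for independent residuals).

```python
import numpy as np


def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation of a sequence."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    return float(np.sum(d[1:] * d[:-1]) / np.sum(d * d))


def durbin_watson(resid):
    """Durbin-Watson statistic: about 2 when residuals are uncorrelated."""
    resid = np.asarray(resid, dtype=float)
    return float(np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2))


def simulate_series(n=200, phi=0.8, seed=0):
    """Synthetic series: linear trend plus AR(1) disturbances (assumed values)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n, dtype=float)
    shocks = rng.normal(0.0, 1.0, n)
    errors = np.zeros(n)
    for i in range(1, n):
        # Each disturbance carries over a fraction phi of the previous one.
        errors[i] = phi * errors[i - 1] + shocks[i]
    y = 100.0 - 0.2 * t + errors
    return t, y


def ols_residuals(t, y):
    """Residuals of the classical linear regression of y on time."""
    X = np.column_stack([np.ones_like(t), t])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


t, y = simulate_series()
resid = ols_residuals(t, y)
r1 = lag1_autocorrelation(resid)
dw = durbin_watson(resid)
print(f"lag-1 autocorrelation of residuals: {r1:.2f}")
print(f"Durbin-Watson statistic: {dw:.2f}")
```

In a run of this sketch, a lag-1 autocorrelation well above 0 together with a Durbin-Watson statistic well below 2 indicates that the regression residuals are not independent over time, so the usual standard errors and tests of the regression cannot be trusted.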
The problem of time dependencies between the residuals in the classical linear regression analysis can be solved in a number of ways:
- additional explanatory variables can be added to the regression of the dependent variable on time such that the dependencies are removed from the residuals
- the relation between the dependent variable and time can be analysed with generalised linear models or non-linear models
- the dependent variable can be analysed with dedicated time series analysis techniques like ARIMA, DRAG and state space models
The application of dedicated time series analysis techniques is to be preferred, because they explicitly take the time dependencies between the observations into account, thus greatly improving the chances of obtaining residuals that do satisfy the model assumptions. As such, these techniques allow reliable testing of whether the estimated relationships between dependent and independent variables in the analysis are statistically meaningful or not.
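The effect of modelling the time dependence explicitly can be illustrated with a deliberately simplified stand-in for the dedicated techniques named above. The sketch below does not implement ARIMA, DRAG or a state space model. It assumes that the disturbances follow a first-order autoregressive process and applies a single quasi-differencing step (in the spirit of the Cochrane-Orcutt procedure) to a synthetic series. All names, parameter values and data are hypothetical and chosen only for illustration.

```python
import numpy as np


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    return float(np.sum(d[1:] * d[:-1]) / np.sum(d * d))


def fit_ols(X, y):
    """Least-squares coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta


# Synthetic data (assumed values): linear trend plus AR(1) disturbances.
rng = np.random.default_rng(1)
n, phi = 200, 0.8
t = np.arange(n, dtype=float)
shocks = rng.normal(0.0, 1.0, n)
errors = np.zeros(n)
for i in range(1, n):
    errors[i] = phi * errors[i - 1] + shocks[i]
y = 100.0 - 0.2 * t + errors
X = np.column_stack([np.ones(n), t])

# Step 1: classical regression on time, ignoring the dependence.
_, naive_resid = fit_ols(X, y)
r1_naive = lag1_autocorr(naive_resid)

# Step 2: use the residual autocorrelation as an estimate of the AR(1)
# parameter and quasi-difference both sides of the regression.
rho_hat = r1_naive
y_star = y[1:] - rho_hat * y[:-1]
X_star = X[1:] - rho_hat * X[:-1]

# Step 3: refit on the transformed data; the coefficients still refer to
# the original intercept and slope.
beta_star, adjusted_resid = fit_ols(X_star, y_star)
r1_adjusted = lag1_autocorr(adjusted_resid)

print(f"residual lag-1 autocorrelation, naive fit:    {r1_naive:.2f}")
print(f"residual lag-1 autocorrelation, adjusted fit: {r1_adjusted:.2f}")
print(f"estimated slope after adjustment: {beta_star[1]:.3f}")
```

Under these assumptions, the residuals of the adjusted fit should show much weaker serial correlation than those of the naive fit, which is the property that makes the subsequent significance tests trustworthy. Real applications would normally rely on a full time series model rather than this one-step correction.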