Local traffic forecasting models

Research, The STANDARD Project

The majority of my PhD research centred on developing road traffic forecasting models. I worked on the EPSRC-funded STANDARD project (Spatio-temporal Analysis of Network Data and Route Dynamics), and was fortunate to have access to a large dataset of travel times provided by Transport for London, collected using automatic number plate recognition (ANPR) on London’s road network. Diagram 1 shows, in simple terms, how ANPR works.


Diagram 1 – Observing travel times using ANPR: a vehicle passes camera l1 at time t1, and its number plate is read. It then traverses link l and passes camera l2 at time t2, where its number plate is read again. The two reads are matched using inbuilt software, and the travel time (TT) is calculated as t2 – t1. Raw TTs are converted to unit travel times (UTTs) by dividing by len(l), the length of link l. This figure is reproduced from the original article.
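The calculation in Diagram 1 can be sketched in a few lines of Python. This is only an illustration: the function name, the example timestamps and the 1,200 m link length are hypothetical, and the units (seconds per metre) are an assumption rather than something stated in this post.

```python
from datetime import datetime


def unit_travel_time(t1: datetime, t2: datetime, link_length_m: float) -> float:
    """Unit travel time for one matched vehicle, in seconds per metre (assumed units).

    t1 and t2 are the times at which the same number plate was read
    at the first and second camera of a link.
    """
    if link_length_m <= 0:
        raise ValueError("link length must be positive")
    travel_time_s = (t2 - t1).total_seconds()  # raw travel time, t2 - t1
    if travel_time_s <= 0:
        raise ValueError("t2 must be later than t1")
    return travel_time_s / link_length_m  # divide by len(l)


# Hypothetical vehicle: 2 min 30 s to traverse a 1,200 m link.
utt = unit_travel_time(
    datetime(2011, 6, 8, 8, 0, 0),
    datetime(2011, 6, 8, 8, 2, 30),
    1200.0,
)
print(utt)  # 150 s / 1200 m = 0.125
```

Dividing by the link length makes travel times on links of different lengths directly comparable.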

I was interested in short-term travel time forecasting, attempting to answer the question: given knowledge of what travel times are like now, how will they develop over the next hour? The standard way to do this is to build a model that accounts for the dependency between past, current, and potential future values of travel time, using knowledge gained from a large dataset of historical traffic patterns. Statistical time series models and neural networks are popular choices.

What we found in the STANDARD project, and what has been recognised elsewhere, is that a single model is not always sufficient to capture the variation in travel times across time and space. This is because of temporal nonstationarity (the statistical properties of travel times change over time) and spatial heterogeneity (they differ from one part of the network to another). Therefore, I focussed a lot of my research effort on developing local models for travel time forecasting. To deal with the temporal nonstationarity I developed a kernel-based model, with local kernels centred on each time point, called Local Online Kernel Ridge Regression (LOKRR). The idea is that these local kernels capture all the relevant historical information about travel times at particular times of day.
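For readers unfamiliar with kernel ridge regression, the sketch below shows the generic textbook method in NumPy. It is not the LOKRR implementation from the article: the Gaussian kernel, the function names, the parameters sigma and lam, and the toy data are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dist = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))


def krr_fit(X, y, sigma, lam):
    """Solve (K + lam * I) alpha = y for the dual weights alpha."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)


def krr_predict(X_train, alpha, X_new, sigma):
    """Predict as a kernel-weighted combination of the training targets."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha


# Toy data (hypothetical): each row is a short pattern of recent travel times,
# and each target is the travel time observed some steps later.
X_train = np.array([[0.10, 0.11, 0.12],
                    [0.12, 0.14, 0.18],
                    [0.20, 0.22, 0.21],
                    [0.11, 0.10, 0.10]])
y_train = np.array([0.15, 0.25, 0.19, 0.10])

alpha = krr_fit(X_train, y_train, sigma=0.05, lam=1e-3)
forecast = krr_predict(X_train, alpha, np.array([[0.12, 0.13, 0.17]]), sigma=0.05)
print(forecast)
```

In a local model, a separate set of training patterns (and hence a separate kernel matrix) would be kept for each time of day, rather than one global set.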


Diagram 2 – Concept of local kernels: for time point t, a kernel Kt of size 2w+1 is created, where w is the window size.

Diagram 2 shows the concept of local kernels graphically. The green point is the point to be forecast and the green box is the most recently observed travel time pattern. The red boxes indicate the data stored in the local kernel and the red dots represent the same time of day as the green point on previous days.

The window size w is important because it captures the variability in travel times. For example, if one were to commute to work on the same road at approximately the same time each day, one might observe that the road tends to become congested at approximately the same time each day, and be able to make statements such as “if I leave after 9am there is always too much traffic”, or “if I set off before 8am my journey is usually pretty quick”. However, there is usually significant variation around such trends. On some days a link may become congested earlier or later than usual, or the congestion may be slightly more or less severe, and one might find oneself making a statement such as “It’s especially busy today, something must have happened”, or “Wow, it’s really quiet, it’s usually really busy by now”. These intuitive observations summarise the variability inherent in traffic data, and the window accounts for them.
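One plausible reading of Diagram 2 can be sketched as follows: for a time-of-day index t, collect from each previous day the travel time patterns that end within w intervals of t, together with the value observed h steps later. The function name, the array layout, the pattern length m and the horizon h are hypothetical, and the article's exact construction may differ.

```python
import numpy as np


def local_training_set(history, t, w, m, h):
    """Collect local training data around time-of-day index t.

    history : array of shape (n_days, n_intervals), one row per previous day
    t       : time-of-day index the local kernel is centred on
    w       : window size; offsets t-w .. t+w are used (2w+1 per day)
    m       : length of each input pattern (assumed parameter)
    h       : forecast horizon in intervals (assumed parameter)
    """
    n_days, n_intervals = history.shape
    patterns, targets = [], []
    for day in range(n_days):
        for s in range(t - w, t + w + 1):
            start, target = s - m + 1, s + h
            if start < 0 or target >= n_intervals:
                continue  # skip patterns that would run off the ends of the day
            patterns.append(history[day, start:s + 1])
            targets.append(history[day, target])
    return np.array(patterns), np.array(targets)


# Synthetic example: 3 previous days of 20 intervals each.
history = np.arange(60, dtype=float).reshape(3, 20)
X, y = local_training_set(history, t=10, w=2, m=3, h=4)
print(X.shape, y.shape)  # (15, 3) (15,): 3 days x (2*2 + 1) offsets
```

A larger w brings in patterns from slightly earlier and later in the day, which is how days on which congestion starts early or late can still contribute relevant examples.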

The end result is a model that is better able to forecast the variability in travel times. Diagram 3 shows the performance of the model at forecasting travel times 1 hour ahead on a road link in Central London. Because LOKRR uses local knowledge of likely traffic conditions, it is better able to forecast the afternoon peak period than the comparison models.





Diagram 3 – Time series plots of the observed series (thick black line) against the forecast series 1 hour ahead on a) Wednesday 8th June b) 9th June c) 10th June 2011. Comparison models are Elman neural network (ANN), autoregressive integrated moving average (ARIMA) and support vector regression (SVR). These figures are reproduced from the original article.

This research is a good starting point. However, there are still many challenges that need to be addressed. For example, none of the models used here can forecast the large peak on the morning of the Wednesday in Diagram 3a. This is non-recurrent congestion, which may be caused by an incident or a planned event. Often the occurrence and effect of such events are unpredictable, but we can model their spread using spatio-temporal approaches, which is the focus of ongoing research.

This post is based on an article entitled Local Online Kernel Ridge Regression for Forecasting of Urban Travel Times, published in Transportation Research Part C: Emerging Technologies. The article is open access; the PDF is available at the link below.

PDF: Local Online Kernel Ridge Regression for Forecasting of Urban Travel Times.

DOI: 10.1016/j.trc.2014.05.015