Automated Time Series Forecasting for Biosurveillance

Statistics in Medicine, Forthcoming

Robert H. Smith School Research Paper No. RHS 06-035

26 Pages. Posted: 11 Aug 2006


Howard Burkom

Johns Hopkins University, Applied Physics Laboratory

Sean Patrick Murphy

Johns Hopkins University, Applied Physics Laboratory

Abstract

For robust detection performance, alerting algorithms for biosurveillance require input data free of trends, day-of-week effects, and other systematic behavior. Time series forecasting methods may be used to remove this behavior by subtracting forecasts from observations to form residuals for algorithmic input. We describe three forecast methods and compare their predictive accuracy on each of 16 authentic syndromic data streams. The methods are (1) a nonadaptive loglinear regression model using a long historical baseline, (2) an adaptive regression model with a shorter, sliding baseline, and (3) the Holt-Winters method for generalized exponential smoothing. Criteria for comparing the forecasts were the root-mean-square error, the median absolute percent error (MedAPE), and the median absolute deviation. The median-based criteria showed best overall performance for the Holt-Winters method. The MedAPE measures over the 16 test series averaged 16.5, 11.6, and 9.7 for the nonadaptive regression, adaptive regression, and Holt-Winters methods, respectively. The nonadaptive regression forecasts were degraded by changes from the data behavior in the fixed baseline period used to compute model coefficients. The mean-based criterion was less conclusive because of the effects of poor forecasts on a small number of calendar holidays. The Holt-Winters method was also most effective at removing serial autocorrelation, with most 1-day-lag autocorrelation coefficients below 0.15. The forecast methods were compared without tuning them to the behavior of individual series. We achieved improved predictions with such tuning of the Holt-Winters method, but practical use of such improvements for routine surveillance will require reliable data classification methods.
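The preconditioning idea in the abstract (forecast, subtract, score the residuals) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an additive Holt-Winters form with a 7-day season, illustrative smoothing constants (`alpha`, `beta`, `gamma`), a simple first-week initialization, and a synthetic daily count series. It also assumes that "median absolute deviation" means the median of the absolute forecast errors. All function names are hypothetical.

```python
from math import sqrt
from statistics import mean, median


def holt_winters_one_step(y, period=7, alpha=0.4, beta=0.0, gamma=0.15):
    """Additive Holt-Winters; returns one-step-ahead forecasts for y[period:].

    The smoothing constants are illustrative assumptions, not values
    taken from the paper.
    """
    first_cycle = y[:period]
    level = mean(first_cycle)                     # initial level: first-week mean
    trend = 0.0                                   # assume no initial trend
    season = [v - level for v in first_cycle]     # initial day-of-week effects
    forecasts = []
    for t in range(period, len(y)):
        s = season[t % period]
        forecasts.append(level + trend + s)       # forecast made before seeing y[t]
        new_level = alpha * (y[t] - s) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        season[t % period] = gamma * (y[t] - new_level) + (1 - gamma) * s
        level = new_level
    return forecasts


def rmse(obs, fc):
    """Root-mean-square error."""
    return sqrt(mean((o - f) ** 2 for o, f in zip(obs, fc)))


def medape(obs, fc):
    """Median absolute percent error; zero observations are skipped."""
    return 100 * median(abs(o - f) / o for o, f in zip(obs, fc) if o != 0)


def mad(obs, fc):
    """Median of absolute forecast errors (assumed meaning of MAD here)."""
    return median(abs(o - f) for o, f in zip(obs, fc))


def lag1_autocorr(r):
    """Lag-1 sample autocorrelation of a residual series."""
    m = mean(r)
    num = sum((a - m) * (b - m) for a, b in zip(r[:-1], r[1:]))
    den = sum((a - m) ** 2 for a in r)
    return num / den


# Synthetic daily counts: a weekly pattern plus an occasional bump (made-up data).
weekly = [120, 110, 105, 100, 95, 60, 55]
series = [weekly[t % 7] + (5 if t % 10 == 0 else 0) for t in range(70)]

forecasts = holt_winters_one_step(series)
observed = series[7:]
residuals = [o - f for o, f in zip(observed, forecasts)]  # input for an alerting algorithm

print("RMSE:", rmse(observed, forecasts))
print("MedAPE:", medape(observed, forecasts))
print("MAD:", mad(observed, forecasts))
print("lag-1 autocorrelation of residuals:", lag1_autocorr(residuals))
```

The median-based measures are insensitive to a handful of large errors, which is consistent with the abstract's remark that the mean-based criterion was distorted by poor forecasts on a few holidays.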

Keywords: forecasting, regression, biosurveillance, exponential smoothing, time series, preconditioning

Suggested Citation

Burkom, Howard and Murphy, Sean Patrick, Automated Time Series Forecasting for Biosurveillance. Statistics in Medicine, Forthcoming; Robert H. Smith School Research Paper No. RHS 06-035. Available at SSRN: https://ssrn.com/abstract=923635

Howard Burkom (Contact Author)

Johns Hopkins University, Applied Physics Laboratory

Baltimore, MD 21218
United States

Sean Patrick Murphy

Johns Hopkins University, Applied Physics Laboratory

Baltimore, MD 21218
United States

