A Simple Method for Unsupervised Anomaly Detection: An Application to Web Time Series Data
29 Pages. Posted: 2 Jul 2021. Last revised: 4 Oct 2021.
Date Written: October 4, 2021
Abstract
We propose a simple anomaly detection method that is applicable to unlabeled time series data and is tractable even for non-technical entities. The method uses density ratio estimation based on a state space model. Our detection rule is based on the likelihood ratio estimated by a dynamic linear model, i.e., the ratio of the likelihood under our model to that under an over-dispersed model, which we call the NULL model. Using the Yahoo S5 data set and the Numenta Anomaly Benchmark (NAB) data set, both publicly available and commonly used benchmarks, we find that our method performs as well as or better than existing methods. This result implies that, in time series anomaly detection, it is essential to incorporate information specific to time series data into the model. In addition, to show that the method is applicable to unlabeled real-world data, we apply it to Web time series data, specifically daily page views and average session duration on an electronic commerce site that deals in insurance goods. We find that increases in page views caused by e-mail newsletter deliveries are less likely to contribute to completed insurance contracts. The result also suggests the importance of monitoring more than one time series simultaneously.
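To make the detection rule concrete, the following is a minimal sketch of a likelihood-ratio score under a dynamic linear model. It is not the paper's specification, which the abstract does not give. It assumes a local level model with known variances, a NULL model that shares the one-step forecast mean but inflates the forecast variance by a fixed factor, and a cutoff of zero on the log ratio. The names `local_level_log_lr`, `obs_var`, `state_var`, and `inflation` are hypothetical.

```python
import numpy as np

def local_level_log_lr(y, obs_var, state_var, inflation=100.0):
    """Log likelihood ratio of each observation: model vs. over-dispersed NULL.

    Assumed model (illustrative only): y_t = mu_t + v_t, mu_t = mu_{t-1} + w_t,
    with Var(v_t) = obs_var and Var(w_t) = state_var.
    Assumed NULL: same forecast mean, forecast variance multiplied by `inflation`.
    """
    y = np.asarray(y, dtype=float)
    m, c = y[0], 1e6              # vague prior on the initial level (assumption)
    log_lr = np.empty(len(y))
    for t, obs in enumerate(y):
        r = c + state_var         # prior variance of the level at time t
        q = r + obs_var           # one-step forecast variance
        e = obs - m               # one-step forecast error
        z2 = e * e / q            # squared standardized error
        # log N(obs; m, q) - log N(obs; m, inflation * q), simplified
        log_lr[t] = 0.5 * np.log(inflation) - 0.5 * z2 * (1.0 - 1.0 / inflation)
        k = r / q                 # Kalman gain
        m = m + k * e             # filtered level
        c = (1.0 - k) * r         # filtered variance
    return log_lr

# Synthetic series with one injected spike (illustrative data, not from the paper)
rng = np.random.default_rng(0)
y = 10.0 + rng.normal(0.0, 1.0, size=100)
y[50] += 15.0

log_lr = local_level_log_lr(y, obs_var=1.0, state_var=0.1)
flags = log_lr < 0.0              # NULL model explains the point better
```

A point is flagged when the over-dispersed NULL model assigns it a higher likelihood than the fitted model. In this sketch the filter still updates on flagged points, so observations right after a large spike may also be flagged. In practice the variances would be estimated from the data rather than fixed.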
Keywords: Unsupervised Anomaly Detection, Dynamic Linear Model, Density Ratio Estimation, Web Time Series Data