Information Leakage in Backtesting

10 pages. Posted: 4 May 2021. Last revised: 20 May 2021.

Johannes Ruf

London School of Economics & Political Science (LSE) - London School of Economics

Weiguan Wang

Shanghai University

Date Written: May 19, 2021

Abstract

Testing the performance of statistical models on historical time series requires careful handling of the data. Even if a dataset appears to be completely separated into an in-sample and an out-of-sample set, information may leak from one to the other. Such leakage can lead to a significant overestimation of the out-of-sample performance of a predictive model. We provide experimental evidence to illustrate how randomised data splits lead to overfitting in the presence of time series structure. The experiment is set up in the framework of option replication, using both real-world and simulated data.
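
The leakage mechanism described in the abstract can be illustrated with a toy sketch. This is not the authors' option-replication experiment: the simulated Gaussian random walk, the nearest-neighbour-in-time predictor, and all function names below are assumptions chosen only to make the effect visible. Under a randomised split, each test point sits next to training points in time, so a model that merely memorises its neighbours looks accurate; under a chronological split, that shortcut is unavailable.

```python
import bisect
import random


def simulate_random_walk(n, seed=0):
    """Return a Gaussian random walk of length n (a strongly autocorrelated series)."""
    rng = random.Random(seed)
    path, level = [], 0.0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def nearest_neighbour_mse(series, train_idx, test_idx):
    """MSE when each test value is predicted by the training value closest in time."""
    train_sorted = sorted(train_idx)
    total = 0.0
    for t in test_idx:
        pos = bisect.bisect_left(train_sorted, t)
        # Only the training points just before and just after t can be nearest.
        candidates = train_sorted[max(pos - 1, 0):pos + 1]
        nearest = min(candidates, key=lambda s: abs(s - t))
        total += (series[t] - series[nearest]) ** 2
    return total / len(test_idx)


def compare_splits(n=2000, test_fraction=0.2, seed=0):
    """Return (randomised-split MSE, chronological-split MSE) for the same series."""
    series = simulate_random_walk(n, seed)
    n_test = int(n * test_fraction)

    # Randomised split: test points are scattered among the training points.
    indices = list(range(n))
    random.Random(seed + 1).shuffle(indices)
    random_test, random_train = indices[:n_test], indices[n_test:]

    # Chronological split: the test set lies strictly after the training set.
    chrono_train = list(range(n - n_test))
    chrono_test = list(range(n - n_test, n))

    return (nearest_neighbour_mse(series, random_train, random_test),
            nearest_neighbour_mse(series, chrono_train, chrono_test))


random_mse, chrono_mse = compare_splits()
print(f"randomised split MSE:    {random_mse:.2f}")
print(f"chronological split MSE: {chrono_mse:.2f}")
```

In this sketch the randomised split should report a much smaller error than the chronological one, even though the predictor has learned nothing about the dynamics of the series. The gap is an artefact of the split, which is the kind of overestimation the abstract warns about.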

Keywords: Data snooping; Hedging; Information leakage; Overfitting; Pseudo real-time; Time series

JEL Classification: G13, C45

Suggested Citation

Ruf, Johannes and Wang, Weiguan, Information Leakage in Backtesting (May 19, 2021). Available at SSRN: https://ssrn.com/abstract=3836631 or http://dx.doi.org/10.2139/ssrn.3836631

Johannes Ruf

London School of Economics & Political Science (LSE) - London School of Economics

United Kingdom

Weiguan Wang (Contact Author)

Shanghai University

Shanghai
China
