A Bootstrap Evaluation of the Effect of Data Splitting on Financial Time Series
Brandeis University - International Business School
Stern School of Business, New York University
This article exposes problems with the commonly used technique of splitting the available data into training, validation, and test sets that are held fixed, warns against drawing overly strong conclusions from such static splits, and shows potential pitfalls of ignoring variability across splits. Using a bootstrap or resampling method, we compare the uncertainty in the solution stemming from the data splitting with neural-network-specific uncertainties (parameter initialization, choice of number of hidden units, etc.). We present two results on data from the New York Stock Exchange. First, the variation due to different resamplings is significantly larger than the variation due to different network conditions. This result implies that it is important not to over-interpret a model, or an ensemble of models, estimated on one specific split of the data. Second, on each split, the neural network solution with early stopping is very close to a linear model; no significant nonlinearities are extracted.
Number of Pages in PDF File: 8
JEL Classification: G1, C5
Working Papers Series
Date posted: January 22, 1997