Financial Data Science: The Birth of a New Financial Research Paradigm Complementing Econometrics?
The European Journal of Finance, Forthcoming
21 Pages Posted: 14 May 2020
Date Written: August 2, 2019
In this paper, we compare and contrast financial data science with econometrics and conclude that the former is inevitably interdisciplinary due to the numerous skill sets needed within a competitive research team. The latter, in contrast, is firmly rooted in economics. The two areas are highly complementary, as they share an equivalent process, with the former’s intellectual point of departure being statistical inference and the latter’s being the data sets themselves. Two challenges arise, however, from the age of big data. First, ever-increasing computational power allows researchers to experiment with an extremely large number of generated test subjects, which leads to the challenge of p-hacking. Second, the extremely large number of observations available in big data sets provides levels of statistical power at which common statistical significance levels are barely a challenge. We argue that the former challenge can be mitigated through adjustments for multiple hypothesis testing where appropriate. However, it can only truly be addressed via a strong focus on the integrity of the research process and of the researchers themselves, with pre-registration and actual out-of-sample periods being the best technical tools, though potentially insufficient in themselves. The latter challenge can be addressed in two ways. First, researchers can simply use more stringent statistical significance levels such as 0.1%, 0.5% and 1% instead of 1%, 5% and 10%, respectively. Second, and more importantly, researchers can use additional criteria such as economic significance, economic relevance and statistical relevance to assess the robustness of statistically significant coefficients. Statistical relevance in particular seems crucial in the age of big data, as an individual coefficient can be considered statistically significant even when its actual statistical relevance (i.e. incremental explanatory power) is extremely small.
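The gap between statistical significance and statistical relevance described above can be illustrated with a small simulation. The following is a minimal sketch and not the authors' method: the sample size, the true slope, the seed, and the helper names `simple_ols` and `bonferroni_alpha` are all assumptions chosen for illustration, and the p-value relies on a normal approximation that is only reasonable for very large samples. It fits a one-regressor OLS model on simulated data and reports the t-statistic next to R², which in this single-regressor setting is the incremental explanatory power over an intercept-only model. A Bonferroni adjustment is included as one simple example of a multiple hypothesis testing correction.

```python
import math
import random


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


def simple_ols(x, y):
    """Regress y on x with an intercept.

    Returns the slope, its t-statistic, a two-sided p-value and R^2.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    r2 = sxy ** 2 / (sxx * syy)
    # Residual variance: unexplained sum of squares over n - 2 degrees of freedom.
    residual_var = syy * (1.0 - r2) / (n - 2)
    t_stat = slope / math.sqrt(residual_var / sxx)
    # Assumption: normal approximation to the t distribution (fine for large n).
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return slope, t_stat, p_value, r2


rng = random.Random(0)  # fixed seed so the run is reproducible
n_obs = 200_000         # hypothetical "big data" sample size
true_slope = 0.02       # hypothetical, deliberately tiny effect

x = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
y = [true_slope * xi + rng.gauss(0.0, 1.0) for xi in x]

slope, t_stat, p_value, r2 = simple_ols(x, y)
print(f"slope={slope:.4f}  t={t_stat:.2f}  p={p_value:.2e}  R^2={r2:.5f}")
print("significant at the 0.1% level:", p_value < 0.001)
print("Bonferroni per-test level, 100 tests at 5%:", bonferroni_alpha(0.05, 100))
```

Under these assumptions the slope should clear even the stricter 0.1% threshold while explaining only a small fraction of one percent of the variance in `y`, which is the situation the abstract warns about.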
Keywords: Big Data, Econometrics, Financial Data Science, Statistical Relevance, Statistical Significance Levels
JEL Classification: C10, C52, C55, G00, G20, G23