Hybrid ARDL-MIDAS-Transformer Time-Series Regressions for Multi-Topic Crypto Market Sentiment Driven by Price and Technology Factors
57 Pages. Posted: 21 Aug 2021. Last revised: 3 Sep 2021.
Date Written: August 19, 2021
A novel hybrid Autoregressive Distributed Lag Mixed Data Sampling (ARDL-MIDAS) model is developed that integrates both deep neural network multi-head attention Transformer mechanisms and sophisticated stochastic text time-series feature and covariate constructions into a mixed-frequency time-series regression model with long memory structure. The resulting class of ARDL-MIDAS-Transformer models retains the interpretability of time-series models whilst exploiting deep neural network attention architectures for higher-order interaction analysis or, as in our use case, for the design of Instrumental Variables to reduce bias in the estimation of the infinite-lag ARDL-MIDAS model.
In this regard, a statistical time-series analysis on mixed data frequencies is undertaken to discover the relationship between various sentiment extraction frameworks, technology factors, and the role that price discovery plays in retail cryptocurrency sentiment (crypto sentiment). This is an interesting time-series modelling challenge, as it involves time-series regression models in which the response process and the regression covariates are observed at different time scales. The sentiment indices constructed for a variety of topics and news sources are produced as a collection of time-series capturing the daily sentiment polarity signals for each "topic", namely each particular market or crypto asset. Different sentiment methods are developed in a time-series context and utilised in the proposed hybrid regression framework.
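As a rough illustration of what a per-topic daily sentiment series looks like as a data structure, the sketch below averages document-level polarity scores by topic and calendar day. This is only a minimal sketch: the function name `daily_polarity_index`, the `(day, topic, score)` record layout, and the use of a simple mean are assumptions made for illustration, not the index construction used in the paper.

```python
from collections import defaultdict
from datetime import date


def daily_polarity_index(records):
    """Average polarity scores per topic and per day.

    records: iterable of (day, topic, score) tuples, where `score` is a
    hypothetical document-level polarity value (e.g. in [-1, 1]).
    Returns a dict {topic: {day: mean score}} with days in sorted order.
    """
    # sums[topic][day] holds [running total, document count]
    sums = defaultdict(lambda: defaultdict(lambda: [0.0, 0]))
    for day, topic, score in records:
        cell = sums[topic][day]
        cell[0] += score
        cell[1] += 1
    return {
        topic: {day: total / count for day, (total, count) in sorted(days.items())}
        for topic, days in sums.items()
    }


# Illustrative (made-up) records: two topics, two days.
example_records = [
    (date(2021, 8, 1), "BTC", 0.5),
    (date(2021, 8, 1), "BTC", -0.1),
    (date(2021, 8, 2), "BTC", 0.2),
    (date(2021, 8, 1), "ETH", 0.0),
]
example_index = daily_polarity_index(example_records)
```

Each topic then maps to one daily time-series, which is the shape of input a daily-frequency response variable would need.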
In terms of modelling, ARDL models within the infinite-lag Koyck transform model family and a MIDAS regression model with a Gegenbauer long memory structure are combined to produce a novel class of infinite-lag, long memory MIDAS time-series regression structures. This joint model is further enhanced with the higher-order feature extraction methods of BERT and VADER.
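To make the two lag structures named here more concrete, the following sketch computes (i) Gegenbauer coefficients, i.e. the power-series coefficients of (1 - 2ux + x^2)^(-d), via the standard three-term recurrence, and (ii) the geometric (Koyck) infinite-lag sum in its recursive form s_t = x_t + lam * s_(t-1). The function names, the parameter names `d`, `u` and `lam`, and the truncation at `n_lags` are assumptions for illustration; how the paper combines these components is not reproduced here.

```python
import numpy as np


def gegenbauer_weights(d, u, n_lags):
    """Coefficients C_j^{(d)}(u), j = 0..n_lags-1, of (1 - 2*u*x + x**2)**(-d).

    Uses the standard recurrence
        j * C_j = 2*u*(j + d - 1) * C_{j-1} - (j + 2*d - 2) * C_{j-2}.
    """
    c = np.zeros(n_lags)
    c[0] = 1.0
    if n_lags > 1:
        c[1] = 2.0 * d * u
    for j in range(2, n_lags):
        c[j] = (2.0 * u * (j + d - 1.0) * c[j - 1]
                - (j + 2.0 * d - 2.0) * c[j - 2]) / j
    return c


def koyck_filter(x, lam):
    """Geometric distributed lag s_t = sum_{j>=0} lam**j * x_{t-j},
    computed recursively as s_t = x_t + lam * s_{t-1} (pre-sample values
    are taken to be zero, which is an assumption of this sketch)."""
    s = np.zeros(len(x))
    acc = 0.0
    for t, xt in enumerate(x):
        acc = xt + lam * acc
        s[t] = acc
    return s
```

The recursive form is what makes the infinite-lag model estimable with a single lagged response term. It also induces a moving-average error correlated with that lagged response, which is a common motivation for instrumental variables in Koyck-type models.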
In addition to the proposed modelling methodology, a detailed real data study is conducted to explore the relationship between daily crypto market sentiment (positive, negative and neutral polarity) and intra-daily (hourly) price log-return dynamics of crypto markets. Furthermore, technology time-series factors are introduced to capture network effects, such as the hash rate, which is an important aspect of money supply relating to the mining of new crypto assets, and block hashing for transaction verification.
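The mixed-frequency setup described here (a daily response with hourly covariates) can be sketched as follows: hourly log-returns are computed from prices, stacked into one row per day, and collapsed to a daily regressor with a weight vector over the intra-day lags. This is a minimal sketch under stated assumptions: the function names, the 24-observations-per-day layout, and the uniform weights in the usage comment are illustrative choices, not the specification estimated in the paper.

```python
import numpy as np


def hourly_log_returns(prices):
    """Log-returns r_t = log(p_t) - log(p_{t-1}) from an hourly price series."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices))


def stack_by_day(hourly, hours_per_day=24):
    """Reshape an hourly series into a (n_days, hours_per_day) matrix.

    Trailing observations that do not fill a complete day are dropped.
    Row i, column h holds hour h of day i.
    """
    hourly = np.asarray(hourly, dtype=float)
    n_days = len(hourly) // hours_per_day
    return hourly[: n_days * hours_per_day].reshape(n_days, hours_per_day)


def midas_aggregate(stacked, weights):
    """Weighted sum over intra-day observations, one value per day."""
    w = np.asarray(weights, dtype=float)
    return stacked @ w


# Usage with a synthetic price path (not real market data):
# prices = np.exp(0.01 * np.arange(49))            # 48 hourly returns of 0.01
# X = stack_by_day(hourly_log_returns(prices))     # shape (2, 24)
# x_daily = midas_aggregate(X, np.full(24, 1 / 24))
```

In a MIDAS regression the weight vector would typically be a low-dimensional parametric function of the lag index rather than the uniform weights used above, so that the number of estimated parameters does not grow with the sampling frequency.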
Keywords: MIDAS; Transformer; multi-scale resolution data; sentiment modelling; natural language processing; Gegenbauer long memory