On-chain Analytics for Sentiment-driven Statistical Causality in Cryptocurrencies

Posted: 11 Feb 2021 Last revised: 22 Nov 2021

Ioannis Chalkiadakis

Institut des Systèmes Complexes de Paris Île-de-France / CNRS - UAR 3611

Anna Zaremba

affiliation not provided to SSRN

Gareth Peters

University of California, Santa Barbara

Mike J. Chantler

Heriot-Watt University - Department of Computer Science

Date Written: December 8, 2020

Abstract

This paper establishes a new framework for assessing multimodal statistical causality between cryptocurrency market (cryptomarket) sentiment and cryptocurrency price processes. To achieve this, we present an efficient algorithm for multimodal statistical causality analysis based on Multiple-Output Gaussian Processes. Signals from different information sources (modalities) are jointly modelled as a Multiple-Output Gaussian Process, and then, using a novel approach to statistical causality based on Gaussian Processes (GPs), we study linear and non-linear causal effects between the different modalities. We demonstrate the effectiveness of our approach in a machine learning application studying the relationship between cryptocurrency spot price dynamics and sentiment time-series data specific to the crypto sector, which we conjecture influences retail investor behaviour. Investor sentiment is extracted from cryptomarket news data via methods developed in the area of statistical machine learning known as Natural Language Processing (NLP). To capture sentiment, we present a novel framework for text-to-time-series embedding, which we then use to construct a sentiment index from publicly available news articles. We conduct a statistical analysis of our statistical sentiment index model and compare it to alternative state-of-the-art sentiment models popular in the NLP literature. With regard to multimodal causality, investor sentiment is our primary modality of exploration, alongside price and a blockchain-technology-related indicator (hash rate). Our analysis shows that the approach is effective in modelling causal structures of varying complexity between heterogeneous data sources, and illustrates the impact that certain modelling choices for the different modalities can have on detecting causality. A solid understanding of these factors is necessary to gauge cryptocurrency adoption by retail investors and to provide sentiment- and technology-based insights into cryptocurrency market dynamics.
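For readers unfamiliar with the Granger causality listed in the keywords, the sketch below illustrates the basic idea in its classical linear form: a series "causes" another, in the statistical sense, if its past values improve prediction of the other beyond what that series' own past already provides. This is not the paper's Multiple-Output Gaussian Process method; it is a plain least-squares F-test on synthetic data, and every name in it (`lagged_design`, `granger_f_stat`, `sentiment`, `returns`) is a hypothetical chosen for illustration.

```python
import numpy as np


def lagged_design(n, drivers, lags):
    """Build a regression matrix: intercept plus `lags` past values of each driver."""
    cols = [np.ones(n - lags)]
    for series in drivers:
        for k in range(1, lags + 1):
            # Row i corresponds to time t = lags + i; this column holds series[t - k].
            cols.append(series[lags - k : n - k])
    return np.column_stack(cols)


def rss(design, target):
    """Residual sum of squares of an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return float(resid @ resid)


def granger_f_stat(y, x, lags=2):
    """F-statistic for 'past x helps predict y beyond past y' (linear model)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = len(y)
    target = y[lags:]
    restricted = lagged_design(n, [y], lags)   # own history only
    full = lagged_design(n, [y, x], lags)      # own history + candidate cause
    rss_r = rss(restricted, target)
    rss_f = rss(full, target)
    dof = len(target) - full.shape[1]
    return ((rss_r - rss_f) / lags) / (rss_f / dof)


# Synthetic data (an assumption for illustration, not data from the paper):
# a persistent "sentiment" series drives "returns" with a one-step delay.
rng = np.random.default_rng(0)
n_obs = 500
sentiment = np.zeros(n_obs)
returns = np.zeros(n_obs)
for t in range(1, n_obs):
    sentiment[t] = 0.5 * sentiment[t - 1] + rng.normal()
    returns[t] = 0.8 * sentiment[t - 1] + rng.normal()

f_sent_to_ret = granger_f_stat(returns, sentiment, lags=2)
f_ret_to_sent = granger_f_stat(sentiment, returns, lags=2)
print(f"sentiment -> returns F = {f_sent_to_ret:.2f}")
print(f"returns -> sentiment F = {f_ret_to_sent:.2f}")
```

A large F-statistic in one direction and a small one in the other is the pattern expected when the dependence runs one way. A linear test of this kind cannot detect the non-linear effects that the abstract says the GP-based approach is designed to capture.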

Keywords: Multiple-Output Gaussian Process, Granger causality, sentiment index, sentiment analysis, text mining, multimodal systems, heterogeneous data, cryptocurrencies, cryptocoin markets, natural language processing

Suggested Citation

Chalkiadakis, Ioannis and Zaremba, Anna and Peters, Gareth and Chantler, Michael John, On-chain Analytics for Sentiment-driven Statistical Causality in Cryptocurrencies (December 8, 2020). Available at SSRN: https://ssrn.com/abstract=3742063 or http://dx.doi.org/10.2139/ssrn.3742063

Ioannis Chalkiadakis (Contact Author)

Institut des Systèmes Complexes de Paris Île-de-France / CNRS - UAR 3611

113 Rue Nationale
Paris, 75013
France

HOME PAGE: http://www.iscpif.fr/

Anna Zaremba

affiliation not provided to SSRN

Gareth Peters

University of California, Santa Barbara

Santa Barbara, CA 93106
United States

Michael John Chantler

Heriot-Watt University - Department of Computer Science

Edinburgh
United Kingdom
