Time and the Value of Data

38 Pages. Posted: 27 Aug 2020

Ehsan Valavi

Harvard Business School

Joel Hestness

Cerebras Systems

Newsha Ardalani

Baidu - Baidu Research

Marco Iansiti

Harvard University - Business School (HBS)

Date Written: August 25, 2020

Abstract

This paper investigates how effective time-dependent data is at improving the quality of AI-based products and services. Time-dependency means that data loses its relevance to a problem over time. This loss degrades an algorithm's performance and thereby reduces the business value it creates. We model time-dependency as a shift in the probability distribution and derive several counter-intuitive results. We prove theoretically that even an infinite amount of data collected over time may have limited value for predicting the future, and that an algorithm trained on a current dataset of bounded size can attain similar performance. Moreover, we prove that increasing data volume by including older datasets may put a company at a disadvantage. With these results, we address the question of how data volume creates a competitive advantage. We argue that time-dependency weakens the barrier to entry that data volume creates for a business, so much so that competing firms equipped with a limited but sufficient amount of current data can attain better performance. This result, together with the fact that older datasets may degrade algorithms' performance, casts doubt on the significance of first-mover advantage in AI-based markets. We complement our theoretical results with an experiment in which we empirically measure the value loss in text data for the next-word prediction task. The empirical measurements confirm the significance of time-dependency and value depreciation in AI-based businesses. For example, after seven years, 100MB of text data becomes as useful as 50MB of current data for the next-word prediction task.
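
To make the closing example concrete, the short sketch below converts the reported figure (100MB of seven-year-old text being worth about 50MB of current text) into an implied yearly retention factor. It assumes a constant exponential depreciation rate, which is an illustrative simplification and not a functional form stated in the abstract. The function names `annual_retention` and `effective_size` are hypothetical.

```python
# Illustrative only: assumes data value decays by a constant factor each year.
# The abstract reports a single data point (100MB -> 50MB-equivalent after
# 7 years); the exponential form is an assumption made for this sketch.

def annual_retention(effective_mb: float, nominal_mb: float, years: float) -> float:
    """Per-year retention factor implied by one (size, age) observation."""
    return (effective_mb / nominal_mb) ** (1.0 / years)


def effective_size(nominal_mb: float, age_years: float, retention: float) -> float:
    """Current-data equivalent of a dataset of a given age, under the assumed decay."""
    return nominal_mb * retention ** age_years


retention = annual_retention(effective_mb=50.0, nominal_mb=100.0, years=7.0)
print(f"implied yearly retention factor: {retention:.4f}")
print(f"implied yearly depreciation:     {1.0 - retention:.2%}")
print(f"100MB aged 3 years is worth about {effective_size(100.0, 3.0, retention):.1f}MB")
```

Under this assumption the seven-year figure corresponds to losing a little under a tenth of the data's effective size per year. A different decay shape would give a different yearly rate from the same data point.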

Keywords: economics of AI, machine learning, non-stationarity, perishability, value depreciation

Suggested Citation

Valavi, Ehsan and Hestness, Joel and Ardalani, Newsha and Iansiti, Marco, Time and the Value of Data (August 25, 2020). Harvard Business School Strategy Unit Working Paper No. 21-016, Available at SSRN: https://ssrn.com/abstract=3680910 or http://dx.doi.org/10.2139/ssrn.3680910

Ehsan Valavi (Contact Author)

Harvard Business School

Soldiers Field Road
Morgan 270C
Boston, MA 02163
United States

Joel Hestness

Cerebras Systems

Newsha Ardalani

Baidu - Baidu Research

China

Marco Iansiti

Harvard University - Business School (HBS)

Soldiers Field Road
Morgan 270C
Boston, MA 02163
United States
