Web-Based Innovation Indicators – Which Firm Website Characteristics Relate to Firm-Level Innovation Activity?
42 Pages. Posted: 24 Feb 2020. Last revised: 4 Dec 2020.
Date Written: 2019
Abstract
Web-based innovation indicators may provide new insights into firm-level innovation activities. However, little is yet known about the accuracy and relevance of web-based information. In this study, we use data on 4,485 German firms from the Mannheim Innovation Panel (MIP) 2019 to analyze which website characteristics are related to innovation activities at the firm level. We measure website characteristics with several text mining methods and use them as features in different Random Forest classification models, which we compare against each other. Our results show that the most relevant website characteristics are the website’s language, the number of subpages, and the total text length. Moreover, the website characteristics predict product innovations and innovation expenditures better than process innovations.
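As a rough illustration of the three website characteristics named above (language, number of subpages, total text length), the following minimal sketch derives them from already-scraped page text. It is not the authors' pipeline: the function name `website_features`, the input layout (a dict mapping subpage URL to plain text), the example URLs, and the crude stopword-based language guess are all assumptions made for this example.

```python
import re

# Hypothetical stopword lists used only for a crude language guess;
# a real study would more likely rely on a dedicated language detector.
GERMAN_STOPWORDS = {"und", "der", "die", "das", "ist", "nicht", "mit", "für", "wir", "ein"}
ENGLISH_STOPWORDS = {"and", "the", "is", "not", "with", "for", "we", "a", "of", "to"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (including umlauts)."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def website_features(pages):
    """Compute simple website characteristics.

    pages: dict mapping a subpage URL to its plain text (assumed input format).
    Returns the number of subpages, the total text length in characters,
    and a guessed language code ("de", "en", or "unknown").
    """
    tokens = [tok for text in pages.values() for tok in tokenize(text)]
    german_hits = sum(tok in GERMAN_STOPWORDS for tok in tokens)
    english_hits = sum(tok in ENGLISH_STOPWORDS for tok in tokens)

    if german_hits == english_hits:
        language = "unknown"
    elif german_hits > english_hits:
        language = "de"
    else:
        language = "en"

    return {
        "n_subpages": len(pages),
        "text_length": sum(len(text) for text in pages.values()),
        "language": language,
    }


# Made-up example site with two subpages.
example_site = {
    "https://example.com/": "Wir entwickeln neue Produkte für die Industrie.",
    "https://example.com/about": "Das Unternehmen ist seit 1990 aktiv und wächst.",
}
features = website_features(example_site)
```

Feature dictionaries of this kind, one per firm, could then be assembled into a table and passed to a Random Forest classifier together with the survey-based innovation labels.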
Keywords: text as data, innovation indicators, machine learning
JEL Classification: C53, C81, C83, O30