Knowledge Graph Embeddings for Dealing with Concept Drift in Machine Learning
39 Pages. Posted: 20 Jan 2021. Publication Status: Accepted.
Stream learning has been widely studied as a way to extract knowledge structures from continuous, rapidly arriving data records. However, efforts to understand whether knowledge representation and reasoning can help address concept drift, one of the core challenges in stream learning, have been limited and scattered, particularly for drift caused by dramatic changes in knowledge. In this work, we study the problem in the context of the semantic representation of data streams in the Semantic Web, i.e., ontology streams. Such streams are ordered sequences of data annotated with an ontological schema. A fundamental challenge is to understand what knowledge should be encoded and how it can be integrated with stream learning methods. To address this, we show that at least three levels of knowledge encoded in ontology streams are needed to deal with concept drift: (i) the existence of novel knowledge gained from stream dynamics, (ii) the significance of knowledge change and evolution, and (iii) the (in)consistency of knowledge evolution. We propose an approach to encoding such knowledge via schema-enabled knowledge graph embeddings through a combination of novel representations: entailment vectors, entailment weights, and a consistency vector. We illustrate our approach on supervised classification tasks.
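To make the first two levels of knowledge more concrete, the following is a minimal sketch, not the encoding proposed in this work. It assumes that each stream snapshot has already been reduced to a set of entailed assertions (written here as plain strings), and it uses a binary indicator vector and a Jaccard-style change ratio as stand-ins. The function names `entailment_vector`, `novel_entailments`, and `change_ratio`, as well as the example assertions, are hypothetical.

```python
def entailment_vector(snapshot, vocabulary):
    """Binary vector: 1 if the entailment holds in the snapshot, else 0.

    Assumption: a simple indicator encoding chosen for illustration only.
    """
    return [1 if entailment in snapshot else 0 for entailment in vocabulary]


def novel_entailments(previous, current):
    """Entailments that hold in the current snapshot but not in the previous one."""
    return current - previous


def change_ratio(previous, current):
    """Share of entailments that differ between two snapshots (Jaccard distance).

    Assumption: a stand-in for 'significance of change', not the paper's measure.
    """
    union = previous | current
    if not union:
        return 0.0
    return len(previous ^ current) / len(union)


# Two hypothetical snapshots of an ontology stream, as sets of entailed assertions.
snap_t0 = {"Road(r1)", "Congested(r1)", "Bus(b7)"}
snap_t1 = {"Road(r1)", "FreeFlow(r1)", "Bus(b7)", "Delayed(b7)"}

# A shared, ordered vocabulary so that vectors from both snapshots are comparable.
vocabulary = sorted(snap_t0 | snap_t1)

vec_t0 = entailment_vector(snap_t0, vocabulary)
vec_t1 = entailment_vector(snap_t1, vocabulary)
novel = novel_entailments(snap_t0, snap_t1)
ratio = change_ratio(snap_t0, snap_t1)
```

Vectors of this kind could be concatenated with ordinary features before training a classifier. Checking the (in)consistency of knowledge evolution would additionally require an ontology reasoner, which this sketch leaves out.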
Our main findings are that: (i) it is possible to develop a general-purpose framework to address concept drift in ontology streams by coupling any machine learning classification algorithm with our proposed schema-enabled knowledge graph embedding method; (ii) our proposed method is robust to significant concept drift (a stream update ratio of up to 51%) and outperforms state-of-the-art methods, with a 12% to 35% improvement in Macro-F1 score in the tested scenarios; (iii) only a small part of the ontological entailments (less than 20%) plays an important role in determining the consistency between two snapshots; (iv) predictions with consistent models outperform those with inconsistent models by over 300% in the two use cases. Our findings could help future work on applications of stream learning, such as autonomous driving, which demand high accuracy in the presence of sudden and disruptive changes.
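For readers unfamiliar with the reported metric, Macro-F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The sketch below computes it from scratch under the standard definition. The function name `macro_f1` and the toy labels are illustrative and are not data from this work.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example: class "a" has F1 = 0.8, class "b" has F1 = 2/3.
y_true = ["a", "a", "a", "b"]
y_pred = ["a", "a", "b", "b"]
score = macro_f1(y_true, y_pred)
```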
Keywords: Ontology, Stream Learning, Concept Drift, Knowledge Graph, Semantic Embedding