H-HRGAN: Knowledge Graph-Driven Representation for Missing Value Imputation

17 Pages Posted: 21 Apr 2025

Hanlin Deng

Harbin Normal University

Guohui Zhou

Harbin Normal University

Yong Deng

Chengdu University of Technology

Wei He

Harbin Normal University

Hailong Zhu

Harbin Normal University

Yanling Cui

Harbin Normal University

Abstract

Missing data is a pervasive challenge in real-world applications, often arising from non-responses, sensor failures, or system inconsistencies. While numerous imputation techniques have been proposed, most are designed for continuous variables and tend to perform poorly on categorical data. In structured formats such as tabular datasets, categorical variables exhibit intricate semantic relationships that conventional methods relying on one-hot encodings or statistical heuristics capture inadequately. To address this issue, we represent each discrete feature value as a node and treat attributes as relations, thereby constructing a knowledge graph. We then propose the Heterogeneous-Homogeneous Relational Graph Attention Network (H-HRGAN), a novel framework for imputing missing categorical values. The framework constructs a hierarchical graph structure that separately captures heterogeneous attribute-value relations and homogeneous co-occurrence patterns, and employs a relational graph attention mechanism to perform multi-level reasoning over this structure. Through this mechanism, a Graph Neural Network (GNN) framework achieves more reasonable feature aggregation, leading to improved predictive performance on categorical data. Extensive experiments on multiple real-world datasets demonstrate that H-HRGAN outperforms state-of-the-art imputation methods, particularly under high missingness rates and complex dependency scenarios.

Keywords: Data imputation, Knowledge graph, Knowledge representation, Graph neural network
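
The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the exact schema, so the sketch assumes that each record is linked to its observed value nodes through an edge labeled with the attribute name (the heterogeneous part), and that value nodes observed in the same record are linked by counted co-occurrence edges (the homogeneous part). The function name `build_graph`, the node naming scheme, and the toy rows are all hypothetical.

```python
from collections import Counter
from itertools import combinations

MISSING = None  # assumed marker for a missing categorical cell


def build_graph(rows):
    """Build two edge sets from tabular categorical data.

    rows: list of dicts mapping attribute name -> value or MISSING.
    Returns (triples, cooccurrence), an assumed schema for illustration:
      triples       heterogeneous edges (record node, attribute, value node)
      cooccurrence  homogeneous edges between value nodes, with counts
    """
    triples = []
    cooccurrence = Counter()
    for i, row in enumerate(rows):
        # Missing cells contribute no node and no edge.
        observed = [(a, v) for a, v in row.items() if v is not MISSING]
        for attr, val in observed:
            # The attribute acts as the relation; the value is the node.
            triples.append((f"row:{i}", attr, f"{attr}={val}"))
        # Sort so each undirected pair has one canonical key.
        value_nodes = sorted(f"{a}={v}" for a, v in observed)
        for u, w in combinations(value_nodes, 2):
            cooccurrence[(u, w)] += 1
    return triples, cooccurrence


# Hypothetical toy table; the second record has a missing "shape".
rows = [
    {"color": "red", "size": "S", "shape": "round"},
    {"color": "red", "size": "S", "shape": MISSING},
    {"color": "blue", "size": "L", "shape": "square"},
]
triples, cooccurrence = build_graph(rows)
```

In this toy table, the value nodes `color=red` and `size=S` co-occur in two records, which is the kind of signal a graph-based imputer could draw on when predicting the missing `shape` of the second record. The attention-based reasoning of H-HRGAN itself is not reproduced here.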

Suggested Citation

Deng, Hanlin and Zhou, Guohui and Deng, Yong and He, Wei and Zhu, Hailong and Cui, Yanling, H-HRGAN: Knowledge Graph-Driven Representation for Missing Value Imputation. Available at SSRN: https://ssrn.com/abstract=5224705 or http://dx.doi.org/10.2139/ssrn.5224705

Hanlin Deng

Harbin Normal University

Harbin
China

Guohui Zhou (Contact Author)

Harbin Normal University

Harbin
China

Yong Deng

Chengdu University of Technology

Wei He

Harbin Normal University

Harbin
China

Hailong Zhu

Harbin Normal University

Harbin
China

Yanling Cui

Harbin Normal University

Harbin
China
