Self-Organizing Granular Encoding for Discrete Data Clustering
17 Pages. Posted: 7 Mar 2025
Abstract
The success of a clustering model depends not only on the model and its hyperparameters but also on how the input variables are handled and encoded. Discrete variables are a common challenge in clustering. The Self-Organizing Map (SOM), also known as the Kohonen map, is a type of artificial neural network inspired by models of biological neural systems. It maps multidimensional data onto lower-dimensional spaces and works best with continuous data. Although the SOM can be adapted for discrete or categorical data, such adaptations are generally less effective. This paper proposes a self-organizing granular (SOG) encoding for discrete data clustering, statistically validated using a $t$-test. Our experiments show that the encoding can significantly enhance discrete data clustering in a neural network-based clustering task. Furthermore, the proposed mapping technique also proved effective in improving other clustering models such as Affinity Propagation, HDBSCAN, GaussianMixture, and OPTICS.
Keywords: Granular Computing, Self-Organizing Map, Fuzzy Set, Discrete Data Clustering, Discrete Data Encoding
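For readers unfamiliar with the setting the abstract describes, the following minimal sketch shows a standard one-dimensional SOM clustering discrete records after a plain one-hot encoding. It is not the SOG encoding proposed in the paper, whose details are not given in the abstract. The toy records, the helper names (`one_hot`, `best_matching_unit`, `train_som`), and all hyperparameter values are illustrative assumptions.

```python
import math
import random


def one_hot(records):
    """One-hot encode rows of categorical values (a baseline, NOT the SOG encoding)."""
    n_cols = len(records[0])
    # Sorted list of distinct levels per column, so the encoding is deterministic.
    levels = [sorted({row[j] for row in records}) for j in range(n_cols)]
    encoded = []
    for row in records:
        vec = []
        for j, value in enumerate(row):
            vec.extend(1.0 if value == level else 0.0 for level in levels[j])
        encoded.append(vec)
    return encoded


def best_matching_unit(weights, x):
    """Return the index of the map node whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda k: math.dist(weights[k], x))


def train_som(data, n_nodes=4, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D SOM with a Gaussian neighbourhood and linearly decaying rates.

    All default values are assumptions chosen for this toy example.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        sigma = max(sigma0 * frac, 1e-3)
        for x in data:
            bmu = best_matching_unit(weights, x)
            for k in range(n_nodes):
                # Nodes near the winner on the 1-D map are pulled more strongly.
                h = math.exp(-((k - bmu) ** 2) / (2 * sigma ** 2))
                for d in range(dim):
                    weights[k][d] += lr * h * (x[d] - weights[k][d])
    return weights


# Hypothetical toy records with two categorical attributes (colour, size).
records = [
    ("red", "small"), ("red", "small"), ("red", "medium"),
    ("blue", "large"), ("blue", "large"), ("green", "large"),
]
encoded = one_hot(records)
weights = train_som(encoded)
# Each record's cluster label is the index of its best-matching map node.
labels = [best_matching_unit(weights, x) for x in encoded]
```

One-hot vectors place every pair of distinct categories at the same distance, which is one reason a SOM tends to work less well on discrete inputs than on continuous ones. An alternative encoding would replace `one_hot` in this sketch while leaving the map training unchanged.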