Self-Organizing Granular Encoding for Discrete Data Clustering

17 Pages. Posted: 7 Mar 2025

Qiang Fu

Queensland University of Technology

Yuefeng Li

Queensland University of Technology

Abstract

The success of a clustering model depends not only on the model and its hyperparameters but also on how the input variables are handled and encoded. Discrete variables are a common challenge in clustering. The Self-Organizing Map (SOM, also known as the Kohonen map) is a type of artificial neural network inspired by models of biological neural systems. It maps multidimensional data onto lower-dimensional spaces and works best with continuous data. Although the SOM can be adapted to discrete or categorical data, such adaptations are generally less effective. This paper proposes a self-organizing granular (SOG) encoding for discrete data clustering, statistically validated using a $t$-test. Our experiments show that the encoding can significantly enhance discrete data clustering in a neural network-based clustering task. Furthermore, the proposed mapping technique also proved effective in improving other clustering models such as Affinity Propagation, HDBSCAN, GaussianMixture, and OPTICS.
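
As a rough illustration of the mapping described in the abstract, the following is a minimal sketch of a conventional SOM in Python with NumPy, applied to one-hot encoded categorical values. It is not the SOG encoding proposed in the paper. The function names `train_som` and `map_to_grid`, the grid size, the linear decay schedule, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def train_som(data, grid_shape=(3, 3), epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a minimal SOM (hypothetical helper, not the paper's method).

    Returns a weight array of shape (rows, cols, n_features).
    """
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every unit, used by the neighbourhood function.
    coords = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                 # assumed linear decay
            sigma = sigma0 * (1.0 - frac) + 1e-3    # keep sigma above zero
            # Best-matching unit: the grid cell whose weights are closest to x.
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # Gaussian neighbourhood centred on the best-matching unit.
            grid_d2 = np.sum((coords - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            # Pull every unit toward x, weighted by its neighbourhood value.
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights


def map_to_grid(data, weights):
    """Return the (row, col) grid position of the best-matching unit per sample."""
    d = np.linalg.norm(weights[None] - data[:, None, None, :], axis=-1)
    flat = d.reshape(len(data), -1).argmin(axis=1)
    return np.stack(np.unravel_index(flat, weights.shape[:2]), axis=1)


# Toy discrete variable with three categories, one-hot encoded (an assumption;
# the paper proposes a different, granular encoding).
categories = np.array([0, 1, 2, 0, 1, 2])
one_hot = np.eye(3)[categories]

weights = train_som(one_hot, grid_shape=(3, 3))
positions = map_to_grid(one_hot, weights)
print(positions)  # one 2-D grid coordinate per 3-D input vector
```

Each three-dimensional one-hot vector is assigned a two-dimensional grid coordinate, which is the dimensionality reduction the abstract refers to. Identical category values always land on the same cell, since the mapping depends only on the input vector and the trained weights.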

Keywords: Granular Computing, Self-Organizing Map, Fuzzy Set, Discrete Data Clustering, Discrete Data Encoding

Suggested Citation

Fu, Qiang and Li, Yuefeng, Self-Organizing Granular Encoding for Discrete Data Clustering. Available at SSRN: https://ssrn.com/abstract=5169370 or http://dx.doi.org/10.2139/ssrn.5169370

Qiang Fu (Contact Author)

Queensland University of Technology

2 George Street
Brisbane, 4000
Australia

Yuefeng Li

Queensland University of Technology
