Efficient Random Subspace Decision Forests with A Simple Probability Dimensionality Setting Scheme
36 Pages Posted: 21 Sep 2022
Abstract
The random subspace concept is widely used in decision forests. However, there is no principled approach for choosing the number of randomly selected features. Previous random subspace decision forests simply set the random subspace dimensionality d_s to a preset function value such as [log_2 d] or [√d], where d is the total number of feature dimensions. Such a preset value can be a relatively poor choice for high-dimensional data sets. In this paper, a novel framework named Efficient Random Subspace decision forest (ERS) is proposed, which focuses on how to set the random subspace dimensionality for each decision tree in a decision forest. Specifically, a simple discrete uniform distribution is used to draw, at random, the number of randomly selected features for each tree in a random subspace decision forest. The proposed Half Range Discrete Uniform distribution-based Varied Dimensionality setting (HRDUVD) method not only removes the need to preset an appropriate d_s for each data set but also yields satisfactory classification performance with relatively short running time. Extensive experiments on public benchmark data sets demonstrate the effectiveness and efficiency of the proposed ERS.
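The per-tree dimensionality setting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the support of the "half range" distribution, so the sketch assumes the integers 1 to ceil(d/2). It also reads the brackets in [log_2 d] and [√d] as floor, which is likewise an assumption. The function names `sample_subspace_dims`, `sample_feature_subsets`, and `preset_dims` are hypothetical.

```python
import math
import random


def sample_subspace_dims(d, n_trees, rng):
    """Draw one subspace dimensionality d_s per tree from a discrete uniform.

    ASSUMPTION: "half range" is read as the integers 1..ceil(d/2);
    the abstract does not state the exact support.
    """
    upper = max(1, math.ceil(d / 2))
    # random.Random.randint is inclusive on both ends.
    return [rng.randint(1, upper) for _ in range(n_trees)]


def sample_feature_subsets(d, n_trees, seed=0):
    """Return, for each tree, a sorted list of feature indices to train on."""
    rng = random.Random(seed)
    dims = sample_subspace_dims(d, n_trees, rng)
    # Each tree gets d_s distinct features drawn without replacement.
    return [sorted(rng.sample(range(d), d_s)) for d_s in dims]


def preset_dims(d):
    """Conventional fixed choices, for comparison.

    ASSUMPTION: the brackets in [log_2 d] and [sqrt(d)] are taken as floor.
    """
    return {
        "log2": max(1, int(math.log2(d))),
        "sqrt": max(1, int(math.sqrt(d))),
    }


if __name__ == "__main__":
    d = 1000
    print("preset:", preset_dims(d))
    subsets = sample_feature_subsets(d, n_trees=5, seed=0)
    print("sampled d_s per tree:", [len(s) for s in subsets])
```

Under these assumptions every tree receives its own d_s, so no single dimensionality has to be tuned per data set. Each sampled index list would then be passed to whatever base decision tree learner the forest uses.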
Keywords: decision forests, ensemble classifiers, random subspace method