An Optimization Method for Characterizing Two Groups of Data
20 Pages Posted: 7 Apr 2020
Date Written: March 13, 2020
Abstract
Feature selection is the task of choosing, from a set of candidate features, a subset that best represents the whole set in a particular respect. We present a bi-objective optimization model for a feature selection problem in the context of data grouping. The aim is to select a set of features that is as small as possible while maximizing the similarities between samples of the same group and the differences between samples of different groups. We propose a lexicographic solution method and prove several properties of the problem. We show that even obtaining feasible solutions for the problem can be challenging, and we therefore develop efficient matheuristic algorithms. We test our algorithms on 136 medium to large datasets, 11 of which are real-world. We show that the proposed matheuristics can deliver high-quality solutions in a reasonable amount of time.
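To make the lexicographic idea concrete, the following is a minimal brute-force sketch on a toy dataset. It is not the model or the matheuristics of the paper, which the abstract does not spell out. Everything here is an assumption made for illustration: the features are binary, a subset counts as feasible when every pair of samples from different groups differs on at least one selected feature, the subset size is treated as the first objective, and the second objective is a simple count of same-group agreements plus cross-group disagreements. The names `cross_pairs_separated`, `score`, and `lexicographic_select` are hypothetical.

```python
from itertools import combinations

# Toy data (assumed for illustration): each sample is a tuple of binary features.
GROUP_A = [(0, 0, 0, 0), (1, 0, 0, 1), (0, 1, 1, 0)]
GROUP_B = [(1, 1, 0, 0), (1, 0, 1, 0)]


def cross_pairs_separated(group_a, group_b, subset):
    """Assumed feasibility rule: every cross-group pair differs on some selected feature."""
    return all(
        any(a[j] != b[j] for j in subset)
        for a in group_a
        for b in group_b
    )


def score(group_a, group_b, subset):
    """Assumed second objective: same-group agreements plus cross-group disagreements."""
    total = 0
    for group in (group_a, group_b):
        for x, y in combinations(group, 2):
            total += sum(x[j] == y[j] for j in subset)
    for a in group_a:
        for b in group_b:
            total += sum(a[j] != b[j] for j in subset)
    return total


def lexicographic_select(group_a, group_b):
    """Stage 1: find the smallest feasible size. Stage 2: best score at that size."""
    n_features = len(group_a[0])
    for size in range(1, n_features + 1):
        feasible = [
            subset
            for subset in combinations(range(n_features), size)
            if cross_pairs_separated(group_a, group_b, subset)
        ]
        if feasible:
            # Among minimum-size feasible subsets, keep the highest-scoring one.
            return max(feasible, key=lambda s: score(group_a, group_b, s))
    return None  # no subset separates the two groups


if __name__ == "__main__":
    best = lexicographic_select(GROUP_A, GROUP_B)
    print(best, score(GROUP_A, GROUP_B, best))
```

On this toy data no single feature is feasible, two pairs of features are, and the second stage picks between them. Exhaustive enumeration grows exponentially with the number of features, which is consistent with the abstract's point that dedicated heuristics are needed at realistic sizes.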
Keywords: optimization, bi-objective, matheuristics