Feature Extraction and Classification Analysis of High-Dimensional Biological Data Based on Dimensionality Reduction Fusion Method
17 Pages Posted: 7 Sep 2023
Abstract
Identifying and extracting characteristic information from complex, high-dimensional biological data is an important problem. This work proposes a dimensionality reduction fusion method that combines random forest, feature extraction, and a neural network, and applies it to recognize and classify two datasets, one of mRNA and one of lncRNA. The results show that the proposed fusion method accurately classifies cancer and non-cancer groups while simultaneously selecting, from a large number of variables, identity variables that are biologically relevant to lung cancer (tumor) as potential biomarkers. The method is presented as an effective tool and theoretical support for lung cancer identification in clinical applications, and it can be extended to other types of cancer or other biological data. Overall, the work provides a method for feature extraction and classification analysis of high-dimensional data.
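As a rough illustration of how the three stages named in the abstract could be chained, the following minimal sketch uses scikit-learn. It is not the authors' implementation: the synthetic data, the use of PCA as the feature extraction step, the number of retained variables (50), the number of components (10), and the network size are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for an expression matrix (samples x variables).
# Assumption: this is NOT the mRNA/lncRNA data used in the paper.
X, y = make_classification(
    n_samples=200, n_features=500, n_informative=15,
    n_redundant=5, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

pipeline = Pipeline([
    ("scale", StandardScaler()),
    # Stage 1: keep the 50 variables with the highest random forest importance.
    # threshold=-np.inf makes max_features the only selection criterion.
    ("rf_select", SelectFromModel(
        RandomForestClassifier(n_estimators=200, random_state=0),
        max_features=50, threshold=-np.inf,
    )),
    # Stage 2: feature extraction. PCA is an assumed choice; the abstract
    # does not say which extraction technique is used.
    ("extract", PCA(n_components=10, random_state=0)),
    # Stage 3: a small neural network classifier (assumed architecture).
    ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                          random_state=0)),
])

pipeline.fit(X_train, y_train)
accuracy = pipeline.score(X_test, y_test)

# Column indices of the variables retained by the random forest stage;
# in a real study these would be the candidate biomarkers to inspect.
selected_idx = np.flatnonzero(pipeline.named_steps["rf_select"].get_support())

print(f"test accuracy: {accuracy:.3f}")
print(f"selected variables: {selected_idx.size}")
```

Wrapping the stages in a single `Pipeline` means the selection and extraction steps are fitted on the training split only, which avoids leaking test samples into the variable ranking.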
Note:
Funding declaration: This study is supported by the National College Students Innovation and Entrepreneurship Training Program (X2023-097).
Conflict of Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Keywords: High-dimensional data, Classification, Dimensionality reduction, Feature selection, Tumor biomarker