KinPred-RNA: Kinase Activity Inference and Cancer Type Classification Using Machine Learning on RNA-Seq Data
29 Pages Posted: 11 Sep 2023 Publication Status: Published
Abstract
Kinases are an important class of enzymes that transfer phosphate groups from high-energy, phosphate-donating molecules to specific substrates, and they play essential roles in various cellular processes. In particular, kinase activities have been shown to be specific biomarkers for certain types of cancer. Although novel algorithms have been developed to calculate kinase activities from phosphoproteomics data, generating such data can be costly and consumes valuable samples. Furthermore, methods dedicated to extracting kinase activities from bulk RNA-sequencing data have not yet been developed. In this study, we propose a novel computational framework, KinPred-RNA, for extracting specific kinase activities from bulk RNA-sequencing data obtained from cancer samples. KinPred-RNA takes the efficient gene signatures of the LINCS-L1000 dataset as input and uses the eXtreme Gradient Boosting (XGBoost) algorithm to predict kinase activities. We applied KinPred-RNA to tissue samples from various cancer types, including invasive breast carcinoma, hepatocellular carcinoma, lung squamous cell carcinoma, glioblastoma multiforme, and uterine corpus endometrial carcinoma. Our results show that KinPred-RNA achieves an average R² above the 0.5 threshold in predicting kinase activity and outperforms other machine learning methods, such as linear regression and random forest, on this task. This makes it a powerful tool for predicting kinase activities and linking them to specific biological functions. In conclusion, our proposed framework could facilitate the identification and prognosis of cancer, providing a valuable tool for future research.
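To make the reported metric concrete, the following is a minimal sketch of how an average R² across kinases could be computed and compared against the 0.5 threshold. It is not the authors' evaluation code: the function names, the per-kinase averaging scheme, and the placeholder kinase labels and values are all assumptions introduced purely for illustration.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observed values are constant")
    return 1.0 - ss_res / ss_tot


def average_r_squared(observed_by_kinase, predicted_by_kinase):
    """Mean of per-kinase R^2 values (assumed averaging scheme)."""
    return mean(
        r_squared(observed_by_kinase[k], predicted_by_kinase[k])
        for k in observed_by_kinase
    )


# Hypothetical activity scores for two placeholder kinases; not real data.
observed = {"KINASE_A": [0.1, 0.4, 0.9, 1.3], "KINASE_B": [1.0, 0.2, 0.5, 0.8]}
predicted = {"KINASE_A": [0.2, 0.5, 0.8, 1.1], "KINASE_B": [0.9, 0.3, 0.4, 0.9]}

avg = average_r_squared(observed, predicted)
print(f"average R^2 = {avg:.3f}; above 0.5 threshold: {avg > 0.5}")
```

An R² of 1 means the predictions reproduce the observed activities exactly, while an R² of 0 means they do no better than always predicting the mean observed activity, which is why a value above 0.5 is a meaningful bar for a regression model.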
Note:
Funding declaration: This study was supported by National Science and Technology Council, Taiwan (NSTC 112-2321-B-A49-016).
Conflict of Interests: We declare no competing interests.
Keywords: Kinase activity, bulk RNA-sequence technology, LINCS-L1000, XGBoost algorithm