Classification of Cervical Cancer Dataset

Avishek Choudhury, Wesabi, Classification of Cervical Cancer Dataset, Proceedings of the 2018 IISE Annual Conference, Orlando, p. 1456-1461

6 Pages Posted: 23 Dec 2018

Y. M. S. Al-Wesabi

Binghamton University

Avishek Choudhury

Stevens Institute of Technology

Daehan Won

Binghamton University

Date Written: December 7, 2018

Abstract

Cervical cancer is the leading gynecological malignancy worldwide. This paper compares several classification techniques and shows the advantage of feature selection for predicting cervical cancer. The dataset has thirty-two attributes and eight hundred and fifty-eight samples, and it suffers from missing values and class imbalance. To address the imbalance, over-sampling, under-sampling, and embedded over- and under-sampling were used. Dimensionality reduction is also needed to improve classifier accuracy, so feature selection methods from two distinct categories, filters and wrappers, were studied. The results show that age, first sexual intercourse, number of pregnancies, smokes, hormonal contraceptives, and STDs: genital herpes are the main predictive features, yielding a high accuracy of 97.5%. The Decision Tree classifier is shown to handle this classification task with excellent performance.
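
The abstract does not say which over-sampling variant was used. As an illustration only, the simplest form, random over-sampling, can be sketched as follows. The function name `random_oversample`, the seed, and the toy rows are assumptions made for this example and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw with replacement until the class reaches the target size.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: 6 negative samples and 2 positive samples.
rows = [[25, 1], [31, 2], [40, 3], [22, 0], [35, 2], [29, 1], [45, 4], [38, 3]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

bal_rows, bal_labels = random_oversample(rows, labels)
balanced_counts = Counter(bal_labels)
```

In practice, resampling of this kind is applied to the training split only, so that duplicated minority samples do not leak into the data used for evaluation.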

Keywords: Cervical cancer, feature selection, classification, imbalanced data, over-sampling

Suggested Citation

Al-Wesabi, Y. M. S. and Choudhury, Avishek and Won, Daehan, Classification of Cervical Cancer Dataset (December 7, 2018). Avishek Choudhury, Wesabi, Classification of Cervical Cancer Dataset, Proceedings of the 2018 IISE Annual Conference, Orlando, p. 1456-1461. Available at SSRN: https://ssrn.com/abstract=3297573

Y. M. S. Al-Wesabi

Binghamton University

PO Box 6001
Binghamton, NY 13902-6000
United States

Avishek Choudhury (Contact Author)

Stevens Institute of Technology

NJ
United States

Daehan Won

Binghamton University

PO Box 6001
Binghamton, NY 13902-6000
United States
