Title: Random forest with SMOTE and ensemble feature selection for cervical cancer diagnosis
Authors: Anjali Kuruvilla; B. Jayanthi
Addresses: School of Computer Studies, Rathnavel Subramaniam College of Arts and Science, Coimbatore, 641 402, Tamil Nadu, India; School of Computer Studies, Rathnavel Subramaniam College of Arts and Science, Coimbatore, 641 402, Tamil Nadu, India
Abstract: Cervical tumours are a leading cause of death worldwide, although they can be prevented by removing afflicted tissue early. Inclusive cervical screening programmes require recognising vulnerabilities in the population. Sexually transmitted diseases (STDs) and smoking contribute to cervical cancer. Building a cancer classifier is a complex learning task. Feature selection (FS) reduces the number of inputs to a prediction system, and reducing model parameters and computation time improves performance. The goal of this work is to create a new ensemble feature selection (EFS) method and classifier for cervical cancer diagnosis. EFS combines the results of several single FS approaches, namely entropy elephant herding optimisation (EEHO), the entropy butterfly optimisation algorithm (EBFO), and recursive feature elimination (RFE), to improve results. Bootstrap aggregation is used to combine the EFS results. The classifier is a random forest (RF) with the synthetic minority oversampling technique (SMOTE). The cancer database from UCI has 32 features and four classes. Classification performance is evaluated using a confusion matrix together with precision, recall, F-measure, and accuracy. The classification algorithms are implemented in MATLAB. The proposed algorithm achieves accuracies of 94.7552%, 94.5221%, 94.8718%, and 94.2890% for the Hinselmann, Schiller, Citology, and Biopsy tests, respectively.
Keywords: cervical cancer; EFS; ensemble feature selection; EEHO; entropy elephant herding optimisation; EBFO; entropy butterfly optimisation algorithm; RFE; recursive feature elimination; RF; random forest; SMOTE; synthetic minority oversampling technique; classification.
DOI: 10.1504/IJCBDD.2023.130318
International Journal of Computational Biology and Drug Design, 2023 Vol.15 No.4, pp.289 - 315
Received: 06 Aug 2022
Accepted: 06 Oct 2022
Published online: 17 Apr 2023