Title: Augmentation of predictive competence of non-small cell lung cancer datasets through feature pre-processing techniques
Authors: M. Sumalatha; Latha Parthiban
Addresses: Department of Computer Science, Periyar University, Salem, Tamil Nadu, India ' Department of Computer Science, Pondicherry University Community College, Puducherry, India
Abstract: Non-small cell lung cancer (NSCLC) comprised of complex hidden and unknown data that is challenging for prediction at the earlier stage. The major objective of the research paper is to develop a novel preprocessing model based on minimisation of features and competency maximisation through feature pre-processing (FPP) to provide augmentation in predictive competence of NSCLC datasets. In Phase-I, the test for relevancy identified behavioural errors like null, empty and NAN values to reduce two features. In Phase-II, regression analysis was performed to find the relationship between features after which four features were removed. In Phase-III, cluster analysis is carried out to find the irrelevant features in the form of clusters and seven features are removed. The competency of NSCLC dataset before FPP showed more accuracy than after FPP with classifiers like simple tree, complex tree, linear SVM, Gaussian SVM, weighted KNN and boosted tree classifiers.
Keywords: non-small cell lung cancer; NSCLC; competency of prediction; relevancy analysis; regression analysis; cluster analysis; feature pre-processing model; feature pre-processing; FPP; competency analytics.
DOI: 10.1504/IJESMS.2023.129985
International Journal of Engineering Systems Modelling and Simulation, 2023 Vol.14 No.2, pp.86 - 100
Received: 02 Aug 2021
Accepted: 17 Nov 2021
Published online: 04 Apr 2023 *