Empirical study of feature selection methods over classification algorithms
Online publication date: Tue, 08-May-2018
by N. Bhalaji; K.B. Sundhara Kumar; Chithra Selvaraj
International Journal of Intelligent Systems Technologies and Applications (IJISTA), Vol. 17, No. 1/2, 2018
Abstract: Feature selection methods are used in machine learning to reduce redundancy in a dataset and to improve the clarity of the system models without losing much information. The objective of this paper is to investigate how feature selection methods perform when applied to different datasets and different classification algorithms. We evaluate the standard metrics accuracy, precision and recall for two feature selection algorithms: Chi-Square feature selection and Boruta feature selection. Experiments conducted in RStudio showed an improvement of around 5-6% in these metrics when the Boruta feature selection algorithm was used. The experiments were run on two datasets with different sets of features, using five standard classification algorithms: Naive Bayes, decision tree, support vector machines (SVM), random forest and gradient boosting.
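As a rough illustration of two ideas named in the abstract, the sketch below ranks categorical features by their chi-square statistic against the class label and computes accuracy, precision and recall for a set of predictions. It is not the authors' code: the paper reports experiments run in R, whereas this is plain Python, and the function names, the toy columns and the toy predictions are all hypothetical. Boruta is not shown here.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Chi-square statistic between one categorical feature and the class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))  # contingency table counts
    feature_totals = Counter(feature)         # row totals
    label_totals = Counter(labels)            # column totals
    score = 0.0
    for f_val, f_count in feature_totals.items():
        for l_val, l_count in label_totals.items():
            expected = f_count * l_count / n  # count expected under independence
            diff = observed.get((f_val, l_val), 0) - expected
            score += diff * diff / expected
    return score


def select_top_k(columns, labels, k):
    """Return the names of the k features with the highest chi-square score."""
    scores = {name: chi_square_score(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def accuracy_precision_recall(y_true, y_pred, positive=1):
    """Accuracy, precision and recall for a binary task with the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Toy data, invented purely for illustration (not from the paper's datasets).
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "informative": ["a", "a", "a", "b", "b", "b"],  # perfectly tracks the label
    "noise": ["x", "y", "x", "y", "x", "y"],        # nearly independent of the label
}
selected = select_top_k(columns, labels, k=1)

# Hypothetical classifier output, used only to exercise the metric function.
predictions = [1, 1, 0, 0, 0, 1]
accuracy, precision, recall = accuracy_precision_recall(labels, predictions)
```

In a comparison like the one the abstract describes, the same three metrics would be computed for each classifier once per feature selection method, so that the two sets of results can be set side by side.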