Set Cover Feature Selection for Text Categorisation and spam detection
Online publication date: Thu, 25-Jun-2009
by Elias F. Combarro, Jose Ranilla, Manuel Roberto Berdasco, Elena Montanes, Irene Diaz
International Journal of Advanced Intelligence Paradigms (IJAIP), Vol. 1, No. 4, 2009
Abstract: This paper studies the performance of the Set Cover (SC) Feature Selection (FS) method on Text Categorisation (TC) and Spam Detection problems. Several variants of the original method are presented, either to cope with the unbalanced problems that are common in TC or to increase efficiency. The behaviour of the algorithm is tested on several collections. The experiments show that these methods greatly reduce the dimensionality of the problem while either maintaining the effectiveness of the classification or causing only a slight decrease.
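The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea behind set-cover-style feature selection, not the authors' method or any of its variants. It assumes a binary bag-of-words representation (each document is a set of terms) and uses the classic greedy set cover heuristic: keep picking the term that appears in the most documents not yet covered. The function name `greedy_set_cover_features` and the toy collection are hypothetical.

```python
from typing import Dict, List, Set


def greedy_set_cover_features(docs: Dict[str, Set[str]]) -> List[str]:
    """Pick terms greedily until every non-empty document contains a chosen term.

    Assumption: this is the textbook greedy set cover heuristic, used here
    for illustration only; the paper's actual procedure may differ.
    """
    # Invert the collection: term -> set of document ids that contain it.
    covers: Dict[str, Set[str]] = {}
    for doc_id, terms in docs.items():
        for term in terms:
            covers.setdefault(term, set()).add(doc_id)

    # Documents with no terms can never be covered, so leave them out.
    uncovered = {doc_id for doc_id, terms in docs.items() if terms}
    selected: List[str] = []
    while uncovered:
        # Take the term covering the most still-uncovered documents;
        # ties are broken alphabetically to keep the result deterministic.
        best = min(covers, key=lambda t: (-len(covers[t] & uncovered), t))
        selected.append(best)
        uncovered -= covers[best]
    return selected


# Hypothetical toy collection: four short messages as sets of terms.
toy_docs = {
    "d1": {"free", "offer", "click"},
    "d2": {"free", "meeting"},
    "d3": {"meeting", "agenda"},
    "d4": {"offer", "agenda"},
}
features = greedy_set_cover_features(toy_docs)
```

On this toy collection two of the five terms are enough to touch every document, which illustrates the kind of dimensionality reduction the abstract reports. Whether classification effectiveness is preserved depends on the classifier and the collection, and this sketch does not address that.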