Title: Subtree selection in kernels for graph classification
Authors: Mehmet Tan; Faruk Polat; Reda Alhajj
Addresses: Department of Computer Engineering, TOBB University of Economics and Technology, Ankara, Turkey; Department of Computer Engineering, Middle East Technical University, Ankara, Turkey; Department of Computer Science, University of Calgary, Calgary, AB, Canada
Abstract: Classification of structured data is essential for a wide range of problems in bioinformatics and cheminformatics. One such problem is in silico prediction of small molecule properties such as toxicity, mutagenicity and activity. In this paper, we propose a new feature selection method for graph kernels that use the subtrees of graphs as their feature sets. A masking procedure that amounts to feature selection is proposed for this purpose. We present experiments conducted on several data sets, as well as a comparison of our method with some frequent-subgraph-based approaches.
Keywords: feature selection; graph kernels; bioinformatics; cheminformatics; subtree selection; graph classification.
DOI: 10.1504/IJDMB.2013.056080
International Journal of Data Mining and Bioinformatics, 2013, Vol. 8, No. 3, pp. 294-310
Received: 03 May 2011
Accepted: 04 May 2011
Published online: 20 Oct 2014