Title: New gene selection algorithm using hyperboxes to improve performance of classifiers
Authors: Adil M. Bagirov; Karim Mardaneh
Addresses: Faculty of Science and Technology, Federation University Australia, University Drive, Mount Helen, P.O. Box 663, Victoria, Australia; Faculty of Science and Technology, Federation University Australia, University Drive, Mount Helen, P.O. Box 663, Victoria, Australia
Abstract: DNA microarray technology makes it possible to measure the expression levels of thousands of genes in a single experiment, which in turn allows classification techniques to be applied to classify tumours. However, the large number of genes and the relatively small number of tumours in gene expression datasets may diminish, in some cases significantly, the accuracy of many classifiers. Efficient gene selection algorithms are therefore required to identify the most informative genes or groups of genes and so improve the performance of classifiers. In this paper, a new gene selection algorithm is developed using marginal hyperboxes of genes or groups of genes for each tumour type. Informative genes are defined using overlaps between hyperboxes. The results on six gene expression datasets demonstrate that the proposed algorithm is able to considerably reduce the number of genes and significantly improve the performance of classifiers.
Keywords: DNA microarray technology; gene expression; gene selection; classification.
DOI: 10.1504/IJBRA.2020.109102
International Journal of Bioinformatics Research and Applications, 2020 Vol.16 No.3, pp.269 - 289
Accepted: 03 Feb 2018
Published online: 20 Aug 2020
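
Illustration: The abstract describes informative genes as being defined through overlaps between per-tumour-type hyperboxes. The sketch below is one possible reading of that idea for single genes, not the authors' algorithm. It assumes that the marginal hyperbox of a gene for a tumour type is the [min, max] interval of its expression values over samples of that type, and that genes with less overlap between these intervals are more informative. The function names (`class_intervals`, `total_overlap`, `rank_genes`) and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Interval = Tuple[float, float]


def class_intervals(values: Sequence[float], labels: Sequence[str]) -> Dict[str, Interval]:
    """Return the [min, max] interval of one gene's expression for each class.

    Assumption: this interval plays the role of a one-dimensional hyperbox.
    """
    boxes: Dict[str, Interval] = {}
    for value, label in zip(values, labels):
        low, high = boxes.get(label, (value, value))
        boxes[label] = (min(low, value), max(high, value))
    return boxes


def overlap_length(a: Interval, b: Interval) -> float:
    """Length of the intersection of two intervals (0.0 if they are disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def total_overlap(values: Sequence[float], labels: Sequence[str]) -> float:
    """Sum of pairwise interval overlaps across all classes for one gene."""
    boxes = list(class_intervals(values, labels).values())
    return sum(
        overlap_length(boxes[i], boxes[j])
        for i in range(len(boxes))
        for j in range(i + 1, len(boxes))
    )


def rank_genes(expression: Dict[str, Sequence[float]], labels: Sequence[str]) -> List[str]:
    """Order genes from least to most overlap (assumed: least overlap is most informative)."""
    return sorted(expression, key=lambda gene: total_overlap(expression[gene], labels))


# Toy data (hypothetical): four samples, two tumour types, three genes.
labels = ["A", "A", "B", "B"]
expression = {
    "g1": [1.0, 2.0, 5.0, 6.0],  # intervals [1, 2] and [5, 6]: no overlap
    "g2": [1.0, 4.0, 3.0, 6.0],  # intervals [1, 4] and [3, 6]: overlap 1.0
    "g3": [1.0, 6.0, 2.0, 5.0],  # intervals [1, 6] and [2, 5]: overlap 3.0
}

print(rank_genes(expression, labels))  # ['g1', 'g2', 'g3']
```

Raw overlap length depends on the scale of each gene, so a practical variant would likely normalise expression values first. Groups of genes would need multi-dimensional boxes rather than intervals. Neither refinement is specified in the abstract.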