Title: On a solution for the high-dimensionality-small-sample-size regression problem with several different microarrays
Authors: Vladimir Nikulin
Addresses: Department of Mathematical Methods in Economy, Vyatka State University, Kirov, 610000, Russia
Abstract: A common phenomenon in biological experiments is that complete measurements cannot be obtained for all samples. Some microarrays are very informative but too expensive to obtain for every sample. However, we can use publicly available background knowledge about the potential links between the components of different microarrays (also known as genes). As a result, we translate all the selected genes into the terms of other genes. These secondary genes are included in the regression models automatically to give the learning processes the right initial directions. The proposed method was tested online during the e-LICO data-mining contest, where we achieved the second-best score.
Keywords: microarrays; regression modelling; LOO; leave-one-out; relevance vector machines; regularisation; random permutations; learning; bioinformatics; secondary genes.
DOI: 10.1504/IJDMB.2014.060049
International Journal of Data Mining and Bioinformatics, 2014 Vol.9 No.3, pp.221 - 234
Received: 07 Aug 2011
Accepted: 29 Dec 2011
Published online: 21 Oct 2014