Title: New cluster ensemble approach to integrative biological data analysis
Authors: Natthakan Iam-On; Tossapon Boongoen; Simon Garrett; Chris Price
Addresses: School of Information Technology, Mae Fah Luang University, 57100, Thailand; Department of Mathematics and Computer Science, Royal Thai Air Force Academy, 10220, Thailand; Aispire Consulting Ltd., Tanyrallt, Aberystwyth, SY23 3PG, UK; Department of Computer Science, Aberystwyth University, SY23 3DB, UK
Abstract: Clinical data have traditionally been the main basis for cancer prognosis. However, this classic approach may be ineffective for analysing morphologically indistinguishable tumour subtypes. Microarray technology has therefore emerged as a promising alternative. Despite a large number of microarray studies, the clinical application of gene expression data analysis remains limited owing to the complexity and noise level of the generated data. Recently, integrative cluster analysis of both clinical and gene expression data has been shown to be an effective way to overcome these problems. This paper presents a novel cluster ensemble method for accurately analysing heterogeneous biological data. Evaluation against real biological and benchmark data sets suggests that the quality of the proposed model is higher than that of many state-of-the-art cluster ensemble techniques and standard clustering algorithms.
Keywords: clustering; cluster ensembles; heterogeneous biological data; link analysis; gene expression data; data analysis; cluster analysis; clinical data; bioinformatics; cancer prognosis; tumour subtypes.
DOI: 10.1504/IJDMB.2013.055495
International Journal of Data Mining and Bioinformatics, 2013 Vol.8 No.2, pp.150 - 168
Received: 02 May 2011
Accepted: 02 May 2011
Published online: 20 Oct 2014