Title: Multi-kernel LS-SVM based integration bio-clinical data analysis and application to ovarian cancer
Authors: Jaya Thomas; Lee Sael
Addresses: Department of Computer Science, SUNY Korea, Incheon, South Korea (Jaya Thomas); Department of Computer Science, SUNY Korea, Incheon, South Korea, and Department of Computer Science, Stony Brook University, NY 11794, USA (Lee Sael)
Abstract: Medical research now makes it possible to acquire diverse data types from the same individual for a particular cancer. A major challenge is how to analyse these multiple data types integratively. In this paper, we introduce a multiple-kernel-based pipeline for integrative analysis of four genomic data types and a set of clinical data. In the pipeline, a multiple kernel is generated from the weighted sum of individual kernels and is used to stratify patients and predict clinical outcomes. We apply the pipeline to ovarian cancer data from TCGA, examine the intra-subtype similarity of clinical factors, and calculate log-rank statistics to verify how well the subtypes cluster. We also examine the power of molecular and clinical data in predicting dichotomised overall survival and tumour grade. The integration of various data types yields better stratification and higher prediction accuracy than using individual data types.
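The weighted-sum construction mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the RBF base kernel, the fixed weights, the random toy data, and the function names rbf_kernel and combine_kernels are all assumptions made for illustration, and the abstract does not say how the weights are chosen.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X (assumed base kernel)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)  # clip tiny negatives caused by round-off
    return np.exp(-gamma * d2)

def combine_kernels(kernels, weights):
    """Weighted sum of kernel matrices, with weights rescaled to sum to 1."""
    weights = np.asarray(weights, dtype=float)
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if np.any(weights < 0):
        raise ValueError("weights must be non-negative to keep the sum PSD")
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))

# Toy stand-ins for two data types measured on the same 6 patients.
rng = np.random.default_rng(0)
views = [rng.normal(size=(6, 4)), rng.normal(size=(6, 3))]
kernels = [rbf_kernel(X, gamma=1.0 / X.shape[1]) for X in views]

# Illustrative weights only; a real pipeline would tune or learn them.
K = combine_kernels(kernels, weights=[0.7, 0.3])
```

Because each base kernel is positive semidefinite and the weights are non-negative, the combined matrix K is again a valid kernel, so it can be passed to any kernel method that accepts a precomputed kernel matrix.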
Keywords: integrative analysis; least squares multi-kernel; bio-clinical data; ovarian cancer; LS-SVM; kernel k-means; heterogeneous data; cancer stratification; prognostic prediction.
DOI: 10.1504/IJDMB.2017.089281
International Journal of Data Mining and Bioinformatics, 2017 Vol.19 No.2, pp.150 - 167
Received: 30 Aug 2017
Accepted: 05 Sep 2017
Published online: 11 Jan 2018