Title: Mining hub genes from RNA-Seq gene expression data using biclustering algorithm
Authors: Ankush Maind; Shital Raut
Addresses: Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur, Maharashtra, India; Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur, Maharashtra, India
Abstract: Biclustering is a widely used data mining technique for the analysis of gene expression data. Recently, multiple biclustering algorithms have been designed for finding co-expressed genes in microarray gene expression data. Microarray data, however, has some drawbacks, and RNA-Seq, an advanced high-throughput technology, was introduced to overcome them. In this paper, we introduce a new approach for identifying hub genes from RNA-Seq data using a biclustering algorithm. For mining biclusters, the efficient 'runibic' biclustering algorithm is used. The 'runibic' algorithm performs well with respect to overlapping, noise, stable output, accuracy, large-scale data, and biological significance. For each significant bicluster, we construct a gene co-expression network (GCN). Each constructed GCN is then used for identifying hub genes. The identified hub genes are specific to subsets of experimental conditions. The extracted hub genes can be useful in several clinical applications as prognostic or diagnostic markers of diseases.
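The step from a bicluster to hub genes can be illustrated with a minimal sketch. The abstract does not state how the GCN is built or how hubs are scored, so the choices below are assumptions made only for illustration: edges are drawn where the absolute Pearson correlation meets a hypothetical threshold of 0.8, and genes are ranked by node degree. The function name `hub_genes`, the toy matrix, and the bicluster indices are likewise hypothetical; the bicluster itself is taken as given rather than computed with 'runibic'.

```python
import numpy as np

def hub_genes(expr, gene_ids, rows, cols, threshold=0.8, top_k=3):
    """Rank the genes of one bicluster by degree in a thresholded
    correlation network (an assumed, simplified GCN construction)."""
    # Restrict the expression matrix to the bicluster's genes and conditions.
    sub = expr[np.ix_(rows, cols)]
    # Pairwise Pearson correlation between gene profiles (rows of sub).
    corr = np.corrcoef(sub)
    # Assumed edge rule: connect genes whose |correlation| >= threshold.
    adj = np.abs(corr) >= threshold
    np.fill_diagonal(adj, False)  # no self-loops
    # Degree = number of co-expressed neighbours inside the bicluster.
    degree = adj.sum(axis=1)
    # Stable sort keeps the original gene order among ties.
    order = np.argsort(-degree, kind="stable")[:top_k]
    return [(gene_ids[rows[i]], int(degree[i])) for i in order]

# Toy data: 5 genes x 6 conditions (values are made up for illustration).
expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 9.0, 1.0],
    [2.0, 4.0, 6.0, 8.0, 0.0, 7.0],
    [1.0, 2.0, 3.0, 5.0, 3.0, 3.0],
    [4.0, 1.0, 3.0, 2.0, 5.0, 5.0],
    [7.0, 7.0, 1.0, 0.0, 2.0, 8.0],
])
gene_ids = ["g0", "g1", "g2", "g3", "g4"]

# A hypothetical bicluster: genes g0..g3 under the first four conditions.
result = hub_genes(expr, gene_ids, rows=[0, 1, 2, 3], cols=[0, 1, 2, 3], top_k=4)
print(result)
```

Because the correlation is computed only over the bicluster's conditions, the ranking is specific to that subset of experimental conditions, which mirrors the condition-specific nature of the hub genes described above.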
Keywords: biclustering; RNA-Seq data; data mining; bioinformatics; gene co-expression network; hub gene; biomarker.
DOI: 10.1504/IJDMB.2019.099728
International Journal of Data Mining and Bioinformatics, 2019 Vol.22 No.2, pp.171 - 193
Received: 17 Apr 2018
Accepted: 03 Apr 2019
Published online: 20 May 2019