Title: New algorithms for inferring gene regulatory networks from time-series expression data on Apache Spark
Authors: Yasser Abduallah; Jason T.L. Wang
Addresses: Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey 07102, USA; Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey 07102, USA
Abstract: Gene regulatory networks (GRNs) are crucial to understanding the inner workings of the cell and the complexity of gene interactions. Numerous algorithms have been developed to infer GRNs from gene expression data. As the number of identified genes increases and the complexity of their interactions is uncovered, gene networks become cumbersome to test. Furthermore, sifting through experimental results requires an enormous amount of computation, resulting in slow data processing. Therefore, new approaches are needed to analyse the copious experimental data on cellular GRNs. Cloud computing has been reported in the literature as a promising way to meet this need. Here we present two new algorithms for reverse engineering GRNs in a cloud environment. The algorithms, implemented in Spark, employ an information-theoretic approach to infer GRNs from time-series gene expression data. Experimental results show that one of our new algorithms is faster than, yet as accurate as, two existing cloud-based GRN inference methods.
Keywords: network inference; systems biology; Spark; big data; MapReduce; gene regulatory networks; GRN; time-series; gene expression; big data intelligence.
DOI: 10.1504/IJBDI.2019.100881
International Journal of Big Data Intelligence, 2019 Vol.6 No.3/4, pp.153-162
Received: 15 Feb 2018
Accepted: 16 May 2018
Published online: 19 Jul 2019