Title: Application of consensus string matching in the diagnosis of allelic heterogeneity involving transposition mutation
Authors: Fatema Tuz Zohora; M. Sohel Rahman
Addresses: AℓEDA Group, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh ' AℓEDA Group, Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
Abstract: In this paper, an algorithm is proposed that, given two input DNA sequences, detects the existence of a common ancestor gene sequence under the non-overlapping transposition metric. We consider two cases: fixed-length transposition and all-length transposition. For the first case, the algorithm has a time complexity of O(n³), where n is the length of the input sequences. For all-length transposition, the theoretical worst-case time complexity of the algorithm is proven to be O(n⁴). In practice, however, the worst-case and average-case time complexities for all-length transposition are found to be O(n³) and O(n²), respectively. This work is motivated by the diagnosis of unknown genetic diseases that show allelic heterogeneity, a case where a normal gene mutates in different orders, resulting in two different gene sequences that cause two different genetic diseases. The algorithm can also be useful in the study of breed-related hereditary to determine the genetic spread of a defective gene in the population.
Keywords: consensus string matching; closest string; transposition mutation; genetic diseases; allelic heterogeneity; iterative algorithms; graph; bioinformatics; common ancestor gene sequences; DNA sequences; breed-related hereditary; genetic spread; defective genes.
DOI: 10.1504/IJDMB.2015.072756
International Journal of Data Mining and Bioinformatics, 2015 Vol.13 No.4, pp.360 - 377
Received: 15 Jan 2015
Accepted: 25 Jan 2015
Published online: 28 Oct 2015