Greedily assemble tandem repeats for next generation sequences
Online publication date: Mon, 11-Nov-2019
by Yongqing Jiang; Jinhua Lu; Jingyu Hou; Wanlei Zhou
International Journal of High Performance Computing and Networking (IJHPCN), Vol. 15, No. 1/2, 2019
Abstract: Eukaryotic genomes contain large intronic and intergenic regions in which repetitive sequences are abundant. These repetitive sequences make it difficult to assign short reads generated by next generation sequencing to genomic positions, so they are often excluded from analysis, with the loss of valuable genomic information. Here we present a method, the tandem repeat assembler (TRA), which assembles repetitive sequences by constructing contigs directly from paired-end reads. Using an experimentally acquired data set for human chromosome 14, tandem repeats longer than 200 bp were assembled. Alignment of the contigs to the human genome reference (GRCh38) showed that 84.3% of tandem repetitive regions were correctly covered. For tandem repeats, this method outperformed state-of-the-art assemblers, generating contigs with a correct N50 of up to 512 bp.
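The abstract reports results as a contig N50. As a reminder of what that statistic measures, here is a minimal Python sketch of the standard N50 definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembled length. The function name `n50` and the sample lengths are illustrative assumptions and do not come from the paper. The "correct N50" reported above presumably applies this to contigs after validation against the reference, a step this sketch does not attempt.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (0 if empty).

    Illustrative helper, not taken from the TRA paper.
    """
    if not contig_lengths:
        return 0
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating length.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Stop at the first contig that brings us to half the total.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Made-up contig lengths in bp, for demonstration only.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Comparing `2 * running` against `total` keeps the arithmetic in integers, which avoids rounding issues when the total length is odd.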