Structural variation calling and genotyping by moment-based deep convolutional neural networks
Online publication date: Thu, 05-Aug-2021
by Timothy Becker; Dong-Guk Shin
International Journal of Data Mining and Bioinformatics (IJDMB), Vol. 25, No. 1/2, 2021
Abstract: Structural Variation (SV) calling and genotyping remain an ongoing challenge for next-generation sequencing technologies. The gold-standard approach for genome consortia has been to run multiple SV calling algorithms and merge the results based on SV type and coordinates and, more recently, to use multiple sequencing technologies for each sample cell line. This ensemble strategy provides more comprehensive SV calling but comes at the cost of long, compute-intensive run times. We use popular open-source machine learning libraries to formulate a new data representation suitable for mining whole genome sequences in a fraction of the ensemble run time. We then compare the results to several well-established methods and ensembles. Our pure machine learning method demonstrates a new direction in technique, where feature selection and region filtering are no longer required to achieve desirable false positive rates.
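The ensemble step mentioned in the abstract, merging calls from several algorithms by SV type and coordinates, can be illustrated with a minimal sketch. This is not the authors' method or any consortium's pipeline. The `SVCall` record, the `merge_calls` function, the greedy clustering, and the 50% reciprocal-overlap threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SVCall:
    """Hypothetical record for one SV call (not a format from the paper)."""
    chrom: str
    start: int    # 0-based, inclusive
    end: int      # exclusive
    sv_type: str  # e.g. "DEL", "DUP", "INV"
    caller: str   # label of the algorithm that produced the call


def reciprocal_overlap(a: SVCall, b: SVCall) -> float:
    """Smaller of the two overlap fractions; 0.0 if the calls do not overlap."""
    if a.chrom != b.chrom:
        return 0.0
    inter = min(a.end, b.end) - max(a.start, b.start)
    if inter <= 0:
        return 0.0
    return min(inter / (a.end - a.start), inter / (b.end - b.start))


def merge_calls(calls: List[SVCall], min_overlap: float = 0.5) -> List[List[SVCall]]:
    """Greedily cluster calls that share an SV type and overlap reciprocally.

    Each call joins the first cluster whose first member it matches;
    otherwise it starts a new cluster. The threshold is an assumed default.
    """
    clusters: List[List[SVCall]] = []
    ordered = sorted(calls, key=lambda c: (c.chrom, c.sv_type, c.start, c.end))
    for call in ordered:
        for cluster in clusters:
            rep = cluster[0]
            if rep.sv_type == call.sv_type and reciprocal_overlap(rep, call) >= min_overlap:
                cluster.append(call)
                break
        else:
            clusters.append([call])
    return clusters
```

Each returned cluster groups the calls that different algorithms made for what is plausibly the same event. The pairwise comparison against every existing cluster is quadratic in the worst case, which gives a rough sense of why merging many call sets across a whole genome adds to ensemble run time.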