Statistical analysis of electrophoresis time series for improving basecalling in DNA sequencing
Online publication date: Sat, 12-Apr-2008
by Anna Tonazzini, Luigi Bedini
International Journal of Signal and Imaging Systems Engineering (IJSISE), Vol. 1, No. 1, 2008
Abstract: In automated DNA sequencing, the final algorithmic phase, referred to as basecalling, consists of the translation of four time signals in the form of peak sequences (electropherogram) into the corresponding sequence of bases. Commercial basecallers detect the peaks based on heuristics, and are very efficient when the peaks are distinct and regular in spread, amplitude and spacing. Unfortunately, in practice, the signals are subject to several degradations, among which peak superposition and peak merging are the most frequent. In these cases the experiment must be repeated and human intervention is required. Recently, there have been attempts to provide methodological foundations to the problem and to use statistical models for solving it. In this paper, we exploit a-priori information and Bayesian estimation to remove degradations and recover the signals in an impulsive form which makes basecalling straightforward.
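The abstract does not spell out the authors' model, so the following is only a minimal sketch of the general idea, under stated assumptions: a single channel, a known Gaussian peak shape, additive Gaussian noise, and a sparsity-promoting (Laplacian) prior with a non-negativity constraint. Under those assumptions the MAP estimate can be computed with a simple projected iterative shrinkage scheme. The function names `gaussian_kernel` and `map_deconvolve` and all parameter values are hypothetical and are not taken from the paper.

```python
import numpy as np

def gaussian_kernel(sigma, half_width):
    """Assumed peak shape: a unit-area Gaussian sampled on 2*half_width+1 points."""
    t = np.arange(-half_width, half_width + 1)
    k = np.exp(-0.5 * (t / sigma) ** 2)
    return k / k.sum()

def objective(x, y, kernel, lam):
    """Negative log-posterior (up to constants): data misfit plus L1 prior term."""
    r = np.convolve(x, kernel, mode="same") - y
    return 0.5 * np.dot(r, r) + lam * np.abs(x).sum()

def map_deconvolve(y, kernel, lam=0.02, n_iter=500):
    """Minimise 0.5*||k*x - y||^2 + lam*||x||_1 subject to x >= 0.

    Illustrative MAP estimator (non-negative ISTA), not the authors' algorithm.
    """
    y = np.asarray(y, dtype=float)
    x = np.zeros_like(y)
    # ||k||_1 bounds the operator norm of the convolution, so this step is safe.
    step = 1.0 / (np.abs(kernel).sum() ** 2)
    flipped = kernel[::-1]  # convolving with the flipped kernel applies the adjoint
    for _ in range(n_iter):
        residual = np.convolve(x, kernel, mode="same") - y
        grad = np.convolve(residual, flipped, mode="same")
        # Gradient step, then soft-threshold and clip at zero in one operation.
        x = np.maximum(x - step * (grad + lam), 0.0)
    return x

if __name__ == "__main__":
    # Synthetic example: two nearby impulses blurred into overlapping peaks.
    rng = np.random.default_rng(0)
    truth = np.zeros(100)
    truth[40], truth[46] = 1.0, 0.8
    kernel = gaussian_kernel(sigma=3.0, half_width=12)
    observed = np.convolve(truth, kernel, mode="same") + 0.002 * rng.standard_normal(100)
    estimate = map_deconvolve(observed, kernel)
    print("largest estimated impulses at samples:", np.argsort(estimate)[-2:])
```

In a real electropherogram the same idea would have to be applied to all four channels, with a peak shape and noise level estimated from the data. How well closely spaced peaks are separated depends strongly on `lam`, the assumed peak width, and the number of iterations.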