Title: Protein sub-cellular localisation prediction by analysis of short-range residue correlations
Authors: Jian Guo, Yuanlie Lin, Zhirong Sun
Addresses: Laboratory of Statistical Computing and Bioinformatics, Department of Mathematical Sciences, Tsinghua University, Beijing 100084, PR China; Laboratory of Statistical Computing and Bioinformatics, Department of Mathematical Sciences, Tsinghua University, Beijing 100084, PR China; MOE Key Lab of Bioinformatics, State Key Lab of Biomembrane and Membrane Biotechnology, Institute of Bioinformatics, Department of Biological Sciences and Biotechnology, Tsinghua University, Beijing 100084, China
Abstract: Sub-cellular localisation plays an important role in genome analysis. This paper describes a new residue-couple model that uses a support vector machine to predict the sub-cellular localisation of proteins. The new approach gives better predictions than existing methods. With fivefold cross-validation, the total prediction accuracy on Reinhardt and Hubbard's dataset reaches 92.0% for prokaryotic protein sequences and 86.9% for eukaryotic protein sequences. For a new dataset of 8304 proteins located in eight sub-cellular locations, the total accuracy reaches 88.9%. The model is also robust to N-terminal errors in the sequences.
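The abstract does not define the residue-couple features, so the following is only a minimal sketch of one plausible reading: for each separation (gap) up to a small maximum, count how often each ordered pair of amino acids occurs that many positions apart, then normalise by the number of positions. The function name `residue_couple_features`, the `max_gap` parameter and its default value, and the normalisation are assumptions for illustration, not the authors' published definition. The resulting vector could then be passed to any support vector machine implementation.

```python
from itertools import product

# The 20 standard amino acids; any other symbol in a sequence is skipped.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Map each ordered residue pair (e.g. "AC") to a column index 0..399.
PAIR_INDEX = {a + b: i for i, (a, b) in enumerate(product(AMINO_ACIDS, repeat=2))}


def residue_couple_features(sequence, max_gap=3):
    """Return 400 * max_gap pair frequencies for one protein sequence.

    Hypothetical formulation: for each gap g in 1..max_gap, count ordered
    pairs (sequence[i], sequence[i + g]) and divide by the number of
    positions i available for that gap.
    """
    seq = sequence.upper()
    features = []
    for gap in range(1, max_gap + 1):
        counts = [0] * len(PAIR_INDEX)
        for i in range(len(seq) - gap):
            idx = PAIR_INDEX.get(seq[i] + seq[i + gap])
            if idx is not None:  # ignore pairs with non-standard residues
                counts[idx] += 1
        total = max(len(seq) - gap, 1)  # avoid division by zero on short input
        features.extend(c / total for c in counts)
    return features


vec = residue_couple_features("ACAC", max_gap=2)
print(len(vec))  # 800 features: 400 pairs for each of the 2 gaps
```

Restricting the gap to a few positions is what keeps the features "short-range"; a larger `max_gap` lengthens the vector by 400 entries per extra gap.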
Keywords: sub-cellular localisation; residue-couple model; short-range residue correlations; support vector machine; bioinformatics; proteins; prokaryotic protein sequences; eukaryotic protein sequences; genome sequencing.
DOI: 10.1504/IJBRA.2006.009762
International Journal of Bioinformatics Research and Applications, 2006 Vol.2 No.2, pp.105 - 118
Published online: 09 May 2006