Title: StruLocPred: structure-based protein subcellular localisation prediction using multi-class support vector machine
Authors: Wengang Zhou; Julie A. Dickerson
Addresses: Bioinformatics and Computational Biology Program, Electrical and Computer Engineering Department, Virtual Reality Applications Center, Iowa State University, Ames, IA 50011, USA; Bioinformatics and Computational Biology Program, Electrical and Computer Engineering Department, Virtual Reality Applications Center, Iowa State University, Ames, IA 50011, USA
Abstract: Knowledge of protein subcellular locations can help decipher a protein's biological function. This work proposes three new features: one sequence-based feature, Hybrid Amino Acid Pair (HAAP), and two structure-based features, Secondary Structural Element Composition (SSEC) and solvent accessibility state frequency. A multi-class Support Vector Machine is developed to predict the locations. Testing on two established data sets yields better prediction accuracies than the best available systems. Comparisons with existing methods show comparable results to ESLPred2. When StruLocPred is applied to the entire Arabidopsis proteome, the predictions match the known locations for over 77% of the proteins whose locations are known. An implementation of this system is available at http://wgzhou.ece.iastate.edu/StruLocPred/.
Keywords: protein subcellular localisation; structural features; multi-class SVM; support vector machine; arabidopsis proteome; StruLocPred server; protein function; bioinformatics.
DOI: 10.1504/IJDMB.2012.048173
International Journal of Data Mining and Bioinformatics, 2012 Vol.6 No.2, pp.130 - 143
Received: 06 Aug 2009
Accepted: 31 May 2010
Published online: 17 Dec 2014
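
The abstract names a "solvent accessibility state frequency" feature and a composition feature over secondary structure, but this record does not define either one. As a rough illustration only, a state frequency can be read as the fraction of residues assigned to each state. The sketch below assumes that reading; the function name `state_frequencies`, the three-letter secondary structure alphabet (H, E, C), the two-letter accessibility alphabet (b for buried, e for exposed) and the sample strings are all hypothetical and are not taken from the paper. In particular, the paper's SSEC feature may be computed over structural elements rather than individual residues.

```python
from collections import Counter


def state_frequencies(states, alphabet):
    """Return, for each symbol in `alphabet`, the fraction of positions
    in `states` that carry that symbol.

    Symbols outside `alphabet` still count toward the sequence length,
    so the returned fractions may sum to less than 1.
    """
    if not states:
        return [0.0] * len(alphabet)
    counts = Counter(states)
    return [counts[symbol] / len(states) for symbol in alphabet]


# Hypothetical per-residue annotations for one short protein.
secondary = "HHHHEECC"       # assumed alphabet: H (helix), E (strand), C (coil)
accessibility = "bbeebeee"   # assumed alphabet: b (buried), e (exposed)

# Concatenate both frequency vectors into one feature vector.
features = (
    state_frequencies(secondary, "HEC")
    + state_frequencies(accessibility, "be")
)
print(features)
```

A vector of this kind, one per protein, is the sort of fixed-length input a multi-class classifier such as a support vector machine can be trained on; how StruLocPred actually combines its features is not stated in this record.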