Title: A novel method to measure the semantic similarity of HPO terms
Authors: Jiajie Peng; Hansheng Xue; Yukai Shao; Xuequn Shang; Yadong Wang; Jin Chen
Addresses: School of Computer Science, Northwestern Polytechnical University, Xi'an, China; School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China; School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China; School of Computer Science, Northwestern Polytechnical University, Xi'an, China; School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China; Institute of Biomedical Informatics, College of Medicine, University of Kentucky, Lexington, KY 40536, USA
Abstract: Making a precise disease diagnosis from complex clinical features and a highly heterogeneous genetic background is critical, yet it remains challenging. Recently, phenotype similarity has been effectively applied to model patient phenotype data. However, the existing measures are adapted from Gene Ontology-based term similarity models, which are not optimised for the Human Phenotype Ontology (HPO). We propose a new similarity measure called PhenoSim. Our model includes a noise reduction component to model the noisy patient phenotype data, and a path-constrained Information Content-based method for measuring phenotype semantic similarity. Evaluation tests compared PhenoSim with four existing approaches. They showed that PhenoSim could effectively improve the performance of HPO-based phenotype similarity measurement, thus increasing the accuracy of phenotype-based causative gene prediction and disease prediction.
Keywords: human phenotype ontology; semantic similarity; phenotype similarity; noise reduction; causative gene prediction; disease prediction.
DOI: 10.1504/IJDMB.2017.084268
International Journal of Data Mining and Bioinformatics, 2017 Vol.17 No.2, pp.173 - 188
Received: 10 Mar 2017
Accepted: 14 Mar 2017
Published online: 22 May 2017