Title: Remote homology detection using GA and NSGA-II on physicochemical properties
Authors: Mukti Routray; Niranjan Kumar Ray
Addresses: Department of Computer Science and Engineering, Silicon Institute of Technology, Bhubaneswar, Odisha, India; School of Computer Engineering, KIIT Deemed to be University, Bhubaneswar, Odisha, India
Abstract: Remote homology detection at the amino acid level is a complex problem in computational biology. We use machine learning algorithms to predict the homology of un-annotated protein sequences, which can save time and cost. The work is divided into three phases. In the first phase, features are extracted from protein sequences using Principal Component Analysis (PCA) to build a chromosome set containing representative features of each protein based on its physicochemical properties. The second phase uses a genetic algorithm (GA) to construct a set of chromosomes for classification based on the PCA features, and initialises the classifier to build an error matrix. The third phase uses NSGA-II, with crossover, mutation and tournament selection, to produce the next set of chromosomes. The output of the experiment is a set of minimum classification error values and the minimum number of features used to classify protein families. The approach achieves higher accuracy than profile-based methods.
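The two objectives named in the abstract (classification error and number of features, both minimised) can be illustrated with a short sketch of the non-dominated ranking that NSGA-II relies on. This is not the authors' implementation: the function names, the simple quadratic-time sorting loop, and the candidate values are all assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

# Hypothetical objective pair: (classification error, number of selected features).
Objective = Tuple[float, int]


def dominates(a: Objective, b: Objective) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated_sort(points: Sequence[Objective]) -> List[List[int]]:
    """Group indices of points into successive Pareto fronts (front 0 is best).

    A plain O(n^2)-per-front loop, not the bookkeeping-based fast sort
    described for NSGA-II, but it yields the same fronts.
    """
    remaining = set(range(len(points)))
    fronts: List[List[int]] = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Made-up candidate chromosomes, for illustration only.
candidates: List[Objective] = [(0.10, 8), (0.12, 5), (0.10, 10), (0.20, 5), (0.08, 12)]
fronts = non_dominated_sort(candidates)
print(fronts)
```

In a full NSGA-II loop, tournament selection would prefer chromosomes from lower-numbered fronts, with crowding distance as a tie-breaker; that part is omitted here.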
Keywords: PCA; principal component analysis; feature selection and classification; genetic algorithm; profile-based methods.
DOI: 10.1504/IJCAT.2020.112688
International Journal of Computer Applications in Technology, 2020 Vol.64 No.4, pp.393 - 402
Received: 11 Feb 2020
Accepted: 12 Jun 2020
Published online: 28 Jan 2021