Title: Machine learning methods for predicting the biological activities of molecules in high diverse databases
Authors: Faisal Saeed
Addresses: College of Computer Science and Engineering, Taibah University, Medina, Saudi Arabia
Abstract: In-silico drug discovery methods rely on the similar property principle, which states that structurally similar compounds tend to exhibit similar biological activities. Therefore, biological activity prediction methods that depend on the structures of chemical compounds have been used to discover new drugs, and several computational methods have been applied for this purpose. However, previous studies showed that predicting the biological activities of heterogeneous molecules is still a challenge. This paper used several machine learning methods and different combinations of ensemble learning methods to enhance the performance of molecular activity prediction. In this study, a heterogeneous subset of the MDL Drug Data Report (MDDR) dataset was used. The results show the performance of each method, and these are discussed in order to recommend the best machine learning and ensemble methods for this kind of diverse chemical dataset.
Keywords: biological activities; chemical compounds; chemical informatics; ensemble methods; machine learning methods.
DOI: 10.1504/IJICT.2022.124833
International Journal of Information and Communication Technology, 2022 Vol.21 No.2, pp.170 - 180
Received: 17 Oct 2020
Accepted: 05 Nov 2020
Published online: 09 Aug 2022
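
The abstract names two ideas that a short example may make more concrete: the similar property principle and the combination of several predictors in an ensemble. The following is a minimal sketch in Python, not the method used in the paper. The abstract does not say which fingerprints, classifiers, or combination rules were used, so everything here is an assumption made for illustration: molecules are represented as sets of "on" fingerprint bits, similarity is measured with the Tanimoto coefficient, activity is predicted by a nearest-neighbour vote, and base predictions are combined by simple majority voting. The function names, toy fingerprints, and class labels are all hypothetical.

```python
from collections import Counter


def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen for this sketch: two empty fingerprints score 0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def knn_predict(query, training, k=3):
    """Predict an activity class from the k training molecules most similar to the query.

    `training` is a list of (fingerprint, label) pairs. This applies the similar
    property principle directly: similar structures are assumed to share an activity.
    """
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def majority_vote(predictions):
    """Combine class labels from several base classifiers by simple majority voting."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical toy data: five "molecules" from two made-up activity classes.
training = [
    ({1, 2, 3, 4}, "class_A"),
    ({1, 2, 3, 5}, "class_A"),
    ({1, 2, 6}, "class_A"),
    ({7, 8, 9}, "class_B"),
    ({7, 8, 10}, "class_B"),
]
query = {1, 2, 3}

knn_label = knn_predict(query, training, k=3)
# Labels from two other hypothetical base classifiers, hard-coded for illustration.
ensemble_label = majority_vote([knn_label, "class_B", "class_A"])
```

In a heterogeneous dataset, molecules sharing an activity class can be structurally dissimilar, so a single similarity-based predictor may be unreliable. This is one reason combining several predictors, as the abstract describes, can be attractive.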