Title: Data analysis and prediction model discovery through decision tree learning
Authors: Parham Porouhan
Addresses: Graduate School of Information Technology, Siam University, Bangkok, Thailand
Abstract: Data mining and machine learning tools have been used extensively in bioinformatics to analyse biomedical datasets. In this paper, we use RapidMiner Studio as a platform to generate decision tree models from a medical dataset. Decision trees are among the most popular supervised machine learning techniques and can, for instance, be used effectively to predict the specific indicators or attributes affecting a disease. The paper is divided into two main parts. In the first part, an existing medical dataset with known label values (a so-called training dataset) is used to discover hidden patterns. In the second part, the resulting patterns are used to make predictions. The results provide groundwork for future studies.
Keywords: data mining; rapidminer studio; decision trees; predictive analytics; model discovery; medical dataset.
DOI: 10.1504/IJEDPO.2022.131214
International Journal of Experimental Design and Process Optimisation, 2022 Vol.7 No.1, pp.18 - 30
Received: 18 Jun 2022
Accepted: 17 Aug 2022
Published online: 01 Jun 2023
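
Illustrative sketch: the two-part workflow described in the abstract (learn a decision tree from a labelled training dataset, then use it to predict labels for new records) can be sketched in a few lines of plain Python. This is not the paper's RapidMiner Studio process. It is a minimal stand-in that assumes numeric attributes, binary threshold splits and Gini impurity as the split criterion. The attribute names (age, resting blood pressure), the label names and every data value are hypothetical toy values invented for the example, not taken from the paper's medical dataset.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature_index, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send records to both branches
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def fit(rows, labels, depth=0, max_depth=3):
    """Part 1: learn a tree. A leaf is a label; a node is (feature, threshold, left, right)."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority label
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        fit([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        fit([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    )

def predict(tree, row):
    """Part 2: walk from the root to a leaf and return that leaf's label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical toy training data: [age, resting_blood_pressure] with known labels.
train_rows = [[25, 110], [32, 118], [41, 121], [47, 150],
              [52, 160], [60, 155], [38, 115], [66, 170]]
train_labels = ["healthy", "healthy", "healthy", "at_risk",
                "at_risk", "at_risk", "healthy", "at_risk"]

tree = fit(train_rows, train_labels)
prediction_young = predict(tree, [29, 112])  # unseen record
prediction_older = predict(tree, [58, 165])  # unseen record
print(tree, prediction_young, prediction_older)
```

On this toy data a single threshold on the first attribute separates the two classes, so the learned tree has one decision node and two leaves. A real medical dataset would normally need categorical attribute handling, pruning and a held-out test set, none of which this sketch attempts.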