Title: Effect of various factors on classification performance of ordinal logistic regression
Authors: Ali Vasfi Ağlarcı; Cengiz Bal
Addresses: Department of Biostatistics, Faculty of Medicine, Kastamonu University, Turkey; Department of Biostatistics, Faculty of Medicine, Eskişehir Osmangazi University, Eskişehir 26010, Turkey
Abstract: Classification is the problem of assigning a new observation to one of a set of categories on the basis of its known features. Examples include categorising e-mails as necessary or unnecessary, or diagnosing a disease from a patient's characteristics (such as gender, blood pressure and the presence of various symptoms). Many methods are used for classification. This study investigates the classification performance of ordinal logistic regression, a statistical method, and shows how its classification success changes as the properties of the data set change. To this end, a simulation study was carried out in which data sets with different properties were generated in R. The simulation showed that the correlation structure of the data set, the sample size, and the number and distribution of the response variable categories affect the classification performance of the method. Suggestions are made for improving the classification performance of ordinal logistic regression.
Keywords: statistical learning; classification; ordinal data; simulation.
DOI: 10.1504/IJDMMM.2024.138813
International Journal of Data Mining, Modelling and Management, 2024 Vol.16 No.2, pp.196 - 208
Received: 11 Jan 2023
Accepted: 29 May 2023
Published online: 31 May 2024