Title: Effect of various factors on classification performance of ordinal logistic regression

Authors: Ali Vasfi Ağlarcı; Cengiz Bal

Addresses: Department of Biostatistics, Faculty of Medicine, Kastamonu University, Turkey; Department of Biostatistics, Faculty of Medicine, Eskişehir Osmangazi University, Eskişehir 26010, Turkey

Abstract: Classification is the problem of assigning a new observation to one of a set of categories on the basis of its known features. Examples include categorising e-mails as necessary or unnecessary, or diagnosing a disease from a patient's characteristics (such as gender, blood pressure, and the presence of various symptoms). Various methods are used for classification. This study investigated the classification performance of ordinal logistic regression, a statistical method, and examined how its classification success changes as the properties of the data set change. To this end, a simulation study was carried out in R by generating data sets with different properties. The simulation showed that the correlation structure in the data set, the sample size, and the number and distribution of the response variable categories affected the classification performance of the method. Suggestions are made for improving the classification performance of ordinal logistic regression.

Keywords: statistical learning; classification; ordinal data; simulation.

DOI: 10.1504/IJDMMM.2024.138813

International Journal of Data Mining, Modelling and Management, 2024 Vol.16 No.2, pp.196 - 208

Received: 11 Jan 2023
Accepted: 29 May 2023

Published online: 31 May 2024
