Title: Assessing classification complexity of datasets using fractals
Authors: André Luiz Marasca; Dalcimar Casanova; Marcelo Teixeira
Addresses: Graduate Program in Electrical Engineering (PPGEE), Federal University of Technology – Paraná, Pato Branco, Paraná, Brazil; Graduate Program in Electrical Engineering (PPGEE), Federal University of Technology – Paraná, Pato Branco, Paraná, Brazil; Graduate Program in Electrical Engineering (PPGEE), Federal University of Technology – Paraná, Pato Branco, Paraná, Brazil
Abstract: Supervised classification is a machine learning technique that assigns classes to the objects in a dataset. Depending on the dimensionality and internal structure of the data, classification may become complex. In this paper, we claim that the complexity level of a given dataset can be estimated using fractal analysis. A novel fractal measure, called transition border, is proposed to estimate the chaos behind the distribution of labelled points. Its correlation with the success rate is tested by comparing it against results obtained from other supervised classification methods. Results suggest that this approach can be used to measure the complexity of a classification task in real-valued datasets with three dimensions. The proposed method can also be useful in other scientific domains where fractal analysis is applicable.
Keywords: supervised classification; fractal analysis; chaotic datasets; transition border; fractal dimension; complexity.
DOI: 10.1504/IJCSE.2019.103261
International Journal of Computational Science and Engineering, 2019 Vol.20 No.1, pp.102 - 119
Received: 27 Feb 2018
Accepted: 21 May 2018
Published online: 23 Oct 2019