Title: An improved algorithm to handle noise objects in the process of clustering
Authors: Hasanthi A. Pathberiya; Chandima D. Tilakaratne; Liwan L. Hansen
Addresses: Department of Statistics, University of Sri Jayewardenepura, Nugegoda, Sri Lanka; Department of Statistics, University of Colombo, Colombo, Sri Lanka; School of Computing, Engineering and Mathematics, Western Sydney University, Locked Bag 1797, Penrith NSW 2751, Australia
Abstract: Cluster analysis is an unsupervised learning approach that seeks to recognise hidden grouping structure in a set of objects using a predefined set of rules. Objects with unusual characteristics add noise to the data space, which complicates the clustering structure and can lead to its misinterpretation. This study proposes a novel iterative approach to eliminate the effect of noise objects when deriving clusters from data. The performance of the proposed approach is tested on partitioning, hierarchical and neural network based clustering algorithms, using both simulated and standard datasets supplemented with noise. Compared with conventional clustering algorithms, the proposed approach yields an improvement in the quality of the clustering structure.
Keywords: clustering algorithms; handling noise data; mining methods and algorithms; k-means; Ward's method; self-organising map.
International Journal of Data Science, 2019 Vol.4 No.1, pp.1 - 17
Received: 29 Dec 2016
Accepted: 01 Sep 2017
Published online: 18 Mar 2019