Title: EGA-FMC: enhanced genetic algorithm-based fuzzy k-modes clustering for categorical data
Authors: Medhini Narasimhan; Balaji Balasubramanian; Suryansh D. Kumar; Nagamma Patil
Addresses: Department of Information Technology, National Institute of Technology Karnataka, Surathkal, India (all four authors)
Abstract: Categorical data clustering is the unsupervised task of grouping similar objects that have categorical attributes. We propose a genetic algorithm-based fuzzy k-modes clustering algorithm for categorical data that uses multi-objective rank-based selection with an enhanced elitism operation. Cluster compactness and inter-cluster separation are the objectives to be optimised. In the elitism step, the best parent chromosomes are identified in every iteration; the entire population then passes through selection, crossover and mutation, and the worst children are replaced by the best parents. Evaluated on three real-world datasets, the method produced clusters of better quality than current methods, with a significant reduction in computation time. Statistical significance tests were also conducted to show the superiority of our approach over other clustering solutions.
Keywords: genetic algorithms; categorical data clustering; multi-objective optimisation; elitism; fuzzy clustering.
DOI: 10.1504/IJBIC.2018.092801
International Journal of Bio-Inspired Computation, 2018 Vol.11 No.4, pp.219 - 228
Received: 29 Nov 2016
Accepted: 02 Feb 2018
Published online: 29 Jun 2018
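
The elitism step summarised in the abstract (identify the best parents, evolve the whole population, then overwrite the worst children with those parents) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `elitist_replace`, the integer-list chromosome encoding, the `n_elite` parameter, and the use of a single scalar rank per chromosome (lower is better) are all assumptions made for illustration; the abstract does not say how the multi-objective ranks are computed or how many elites are kept.

```python
from typing import List, Sequence, Tuple

# Hypothetical encoding: the abstract does not specify the chromosome layout.
Chromosome = List[int]


def elitist_replace(
    parents: Sequence[Chromosome],
    parent_ranks: Sequence[float],
    children: Sequence[Chromosome],
    child_ranks: Sequence[float],
    n_elite: int,
) -> Tuple[List[Chromosome], List[float]]:
    """Replace the n_elite worst children with the n_elite best parents.

    Assumption of this sketch: each chromosome has one scalar rank and a
    lower rank is better.
    """
    if len(parents) != len(parent_ranks) or len(children) != len(child_ranks):
        raise ValueError("each chromosome needs exactly one rank")
    if n_elite < 0 or n_elite > min(len(parents), len(children)):
        raise ValueError("n_elite must fit within both populations")

    # Indices of the best parents, smallest rank first.
    best_parents = sorted(
        range(len(parents)), key=lambda i: parent_ranks[i]
    )[:n_elite]
    # Indices of the worst children, largest rank first.
    worst_children = sorted(
        range(len(children)), key=lambda i: child_ranks[i], reverse=True
    )[:n_elite]

    # Copy the children so the inputs are left untouched.
    next_population = [list(c) for c in children]
    next_ranks = list(child_ranks)
    for p, c in zip(best_parents, worst_children):
        next_population[c] = list(parents[p])
        next_ranks[c] = parent_ranks[p]
    return next_population, next_ranks


if __name__ == "__main__":
    pop, ranks = elitist_replace(
        parents=[[1], [2], [3]],
        parent_ranks=[3.0, 1.0, 2.0],
        children=[[4], [5], [6]],
        child_ranks=[1.0, 5.0, 4.0],
        n_elite=2,
    )
    print(pop, ranks)
```

In this sketch the children are assumed to have already been produced by selection, crossover and mutation, so the function covers only the final replacement step.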