Title: EGA-FMC: enhanced genetic algorithm-based fuzzy k-modes clustering for categorical data

Authors: Medhini Narasimhan; Balaji Balasubramanian; Suryansh D. Kumar; Nagamma Patil

Addresses: Department of Information Technology, National Institute of Technology Karnataka, Surathkal, India (all four authors)

Abstract: Categorical data clustering is the unsupervised grouping of similar objects that are described by categorical attributes. We propose a genetic algorithm-based fuzzy k-modes algorithm for clustering categorical data that uses multi-objective rank-based selection with an enhanced elitism operation. Cluster compactness and inter-cluster separation were chosen as the objectives to be optimised. In the elitism step, the best parent chromosomes were identified in every iteration, the entire population was passed through selection, crossover and mutation, and the worst children were then replaced by the best parents. Evaluated on three real-world datasets, our method produced clusters of better quality than current methods with a significant reduction in computation time. Statistical significance tests were also conducted to show the superiority of our approach over other clustering solutions.
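
The elitism step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `elitist_generation`, the parameter `n_elite`, the scalar `cost` function (lower is better, standing in for the paper's multi-objective rank) and the caller-supplied `breed` function (standing in for selection, crossover and mutation) are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

Chromosome = List[int]  # hypothetical encoding; the abstract does not specify one


def elitist_generation(
    parents: Sequence[Chromosome],
    cost: Callable[[Chromosome], float],
    breed: Callable[[Sequence[Chromosome]], List[Chromosome]],
    n_elite: int = 2,
) -> List[Chromosome]:
    """Run one generation and carry the best parents over the worst children."""
    # 1. Identify the best parents before any variation is applied.
    #    Assumption: a single scalar cost replaces the multi-objective rank.
    elite = sorted(parents, key=cost)[:n_elite]

    # 2. Pass the entire population through selection, crossover and mutation.
    #    Assumption: `breed` bundles those three steps.
    children = breed(parents)

    # 3. Drop the worst children and put copies of the best parents in their place.
    keep = max(len(children) - len(elite), 0)
    survivors = sorted(children, key=cost)[:keep]
    return survivors + [list(c) for c in elite]
```

In this sketch a good parent is never lost to an unlucky crossover or mutation, because it re-enters the population after variation instead of being exempted from it.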

Keywords: genetic algorithms; categorical data clustering; multi-objective optimisation; elitism; fuzzy clustering.

DOI: 10.1504/IJBIC.2018.092801

International Journal of Bio-Inspired Computation, 2018 Vol.11 No.4, pp.219 - 228

Received: 29 Nov 2016
Accepted: 02 Feb 2018

Published online: 29 Jun 2018
