Title: Selection of 'K' in K-means clustering using GA and VMA
Authors: Sanjay Chakraborty; Subham Raj; Shreya Garg
Addresses: Computer Science and Engineering Department, Institute of Engineering and Management, Kolkata, India; Computer Science and Engineering Department, Institute of Engineering and Management, Kolkata, India; Computer Science and Engineering Department, Institute of Engineering and Management, Kolkata, India
Abstract: The K-means algorithm is one of the most widely used partitional clustering algorithms. Despite several advances, it still suffers from drawbacks such as sensitivity to the initial cluster centres and a tendency to get stuck in local optima. A poor initial guess of the cluster centres leads to poor clustering results, which is one of the major drawbacks of the algorithm. This paper proposes a new strategy that blends the K-means algorithm with a genetic algorithm (GA) and the volume metric algorithm (VMA) to predict the best initial cluster centres, which K-means alone cannot do. The paper concludes with an analysis of the results of using the proposed measure to determine the number of clusters for the K-means algorithm on several well-known datasets from the UCI Machine Learning Repository.
Keywords: clustering; initial cluster centres; K-means; GA; VMA; volume metric algorithm.
International Journal of Data Science, 2019 Vol.4 No.1, pp.63 - 81
Received: 07 Jun 2017
Accepted: 04 Nov 2017
Published online: 18 Mar 2019