Title: An improved method for k-means clustering based on internal validity indexes and inter-cluster variance
Authors: Guangli Zhu; Xiaoqing Li; Shunxiang Zhang; Xin Xu; Biao Zhang
Addresses: School of Computer Science and Engineering, Anhui University of Science & Technology, Huainan, China (all authors)
Abstract: Traditional internal validity indexes often fail to identify the best cluster number for the k-means clustering algorithm, so a good clustering result usually cannot be obtained. To solve this problem, this paper proposes an improved method for k-means clustering based on internal validity indexes and inter-cluster variance. First, some integers distributed in the interval [2, √n] are selected as initial cluster numbers, and k-means clustering is carried out with each of them to obtain a clustering result. Second, two initial cluster numbers, kD and kC, are extracted by comparing and analysing the validity index values: kD is the cluster number with the optimal DB value and kC is the one with the optimal CH value. Finally, when kD is not equal to kC, the proposed validity index ICS-VAR is used to select the best cluster number kB. Experimental results show that the improved method can obtain a better cluster number under a certain condition.
Keywords: k-means clustering algorithm; internal validity indexes; inter-cluster variance; initial cluster number.
DOI: 10.1504/IJCSE.2022.123112
International Journal of Computational Science and Engineering, 2022 Vol.25 No.3, pp.253-261
Received: 01 Mar 2021
Accepted: 26 May 2021
Published online: 30 May 2022
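
The selection procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: n is taken to be the number of data points; every integer in [2, floor(√n)] is tried, whereas the abstract only says "some integers"; DB and CH are read as the Davies-Bouldin index (lower is better) and the Calinski-Harabasz index (higher is better); and k-means is seeded with a deterministic farthest-first rule chosen only for reproducibility. The abstract does not define ICS-VAR, so the sketch does not attempt it. A hypothetical caller-supplied `tie_break` function stands in for that step, and all function names here are illustrative.

```python
import math

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(pts):
    return [sum(c) / len(pts) for c in zip(*pts)]

def kmeans(points, k, max_iter=100):
    """Lloyd's k-means; returns the list of non-empty clusters.
    Assumption: farthest-first seeding, not the paper's initialisation."""
    cents = [list(points[0])]
    while len(cents) < k:
        cents.append(list(max(points, key=lambda p: min(dist2(p, c) for c in cents))))
    labels = None
    for _ in range(max_iter):
        new = [min(range(k), key=lambda j: dist2(p, cents[j])) for p in points]
        if new == labels:  # assignments stable: converged
            break
        labels = new
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an emptied cluster keeps its old centroid
                cents[j] = centroid(members)
    groups = [[p for p, l in zip(points, labels) if l == j] for j in range(k)]
    return [g for g in groups if g]

def davies_bouldin(clusters):
    """Davies-Bouldin index (needs at least two clusters); lower is better."""
    cents = [centroid(c) for c in clusters]
    spread = [sum(math.sqrt(dist2(p, m)) for p in c) / len(c)
              for c, m in zip(clusters, cents)]
    k = len(clusters)
    return sum(
        max((spread[i] + spread[j]) / math.sqrt(dist2(cents[i], cents[j]))
            for j in range(k) if j != i)
        for i in range(k)) / k

def calinski_harabasz(clusters):
    """Calinski-Harabasz index; higher is better."""
    cents = [centroid(c) for c in clusters]
    n, k = sum(len(c) for c in clusters), len(clusters)
    overall = centroid([p for c in clusters for p in c])
    between = sum(len(c) * dist2(m, overall) for c, m in zip(clusters, cents))
    within = sum(dist2(p, m) for c, m in zip(clusters, cents) for p in c)
    return (between / (k - 1)) / (within / (n - k))

def select_cluster_number(points, tie_break=None):
    """Return (k_d, k_c, k_b). Requires at least 4 points so that sqrt(n) >= 2.
    tie_break(points, k_d, k_c) is a hypothetical stand-in for ICS-VAR."""
    scores = {}
    for k in range(2, math.isqrt(len(points)) + 1):
        clusters = kmeans(points, k)
        scores[k] = (davies_bouldin(clusters), calinski_harabasz(clusters))
    k_d = min(scores, key=lambda k: scores[k][0])  # best DB value
    k_c = max(scores, key=lambda k: scores[k][1])  # best CH value
    if k_d == k_c:
        k_b = k_d
    elif tie_break is not None:
        k_b = tie_break(points, k_d, k_c)
    else:
        k_b = None  # the two indexes disagree and no tie-break was given
    return k_d, k_c, k_b
```

When the two indexes agree, `k_b` is simply that shared value. When they disagree and no `tie_break` is supplied, the sketch returns `None` for `k_b` rather than guessing how ICS-VAR would decide.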