Distributed and multi-core version of k-means algorithm
Online publication date: Mon, 20-May-2019
by Ilias K. Savvas; Dimitrios Tselios; Georgia Garani
International Journal of Grid and Utility Computing (IJGUC), Vol. 10, No. 3, 2019
Abstract: Huge quantities of data are now generated by billions of machines and devices. Numerous methods have been employed to make use of this valuable resource, some of them altered versions of established algorithms. Clustering is one of the most important methods for mining data sources, and k-means is a key algorithm that forms clusters of data according to a set of attributes. Its main shortcoming is its high computational complexity, which makes k-means very inefficient on big data sets. Although k-means is very widely used, a functional distributed variant that also exploits the multi-core power of contemporary machines has not yet been accepted by researchers. In this work, a three-phase distributed/multi-core version of k-means is presented, together with an analysis of its results. The experimental results are in line with the theoretical outcomes and demonstrate the correctness, efficiency, and scalability of the proposed technique.
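The abstract does not spell out the three phases, so the following is only a minimal sketch of the general idea behind splitting k-means work across workers. It is not the authors' method. It assumes standard Lloyd's k-means, where each worker computes per-cluster sums and counts for its own chunk of the data and the partial results are then merged to update the centroids. The names `nearest`, `partial_sums`, `kmeans_chunked`, and `n_chunks` are hypothetical and chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, cen)) for cen in centroids]
    return dists.index(min(dists))


def partial_sums(chunk, centroids):
    """Per-cluster coordinate sums and counts for one chunk of the data."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in chunk:
        k = nearest(point, centroids)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += point[d]
    return sums, counts


def kmeans_chunked(points, centroids, n_chunks=2, max_iter=100):
    """Lloyd's k-means where the assignment step runs chunk by chunk."""
    centroids = [list(map(float, c)) for c in centroids]
    # Assumption: a simple strided split; a real system would partition by node.
    chunks = [points[i::n_chunks] for i in range(n_chunks)]
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        for _ in range(max_iter):
            # Each worker handles one chunk; results are merged below.
            results = list(pool.map(lambda ch: partial_sums(ch, centroids), chunks))
            new_centroids = []
            for k, old in enumerate(centroids):
                total = sum(r[1][k] for r in results)
                if total == 0:
                    new_centroids.append(old)  # keep an empty cluster's centroid
                    continue
                new_centroids.append(
                    [sum(r[0][k][d] for r in results) / total for d in range(len(old))]
                )
            if new_centroids == centroids:  # converged: no centroid moved
                break
            centroids = new_centroids
    return centroids
```

A thread pool keeps the sketch short and portable. For pure-Python arithmetic, CPython threads do not run on several cores at once, so a process pool or separate machines would be needed to see an actual speed-up. The merge step is unchanged in either case, because sums and counts from different chunks can simply be added together.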