Title: New weighted clustering ensemble based on external index and subspace attributes partitions for large features datasets
Authors: Nadjia Khatir; Safia Nait-Bahloul
Addresses: LITIO, Department of Computer Science, Faculty of Exact and Applied Sciences, University Oran 1 Ahmed Ben Bella, Oran, Algeria; LITIO, Department of Computer Science, Faculty of Exact and Applied Sciences, University Oran 1 Ahmed Ben Bella, Oran, Algeria
Abstract: Real-world datasets are commonly large and involve many features, owing either to the variety of domains they are obtained from or to the use of diverse feature extraction techniques. Relatively few works in the literature address selecting and weighting relevant features for the purpose of clustering data. To cope with this issue, this paper proposes a new feature selection framework based on weighting partitions, used in conjunction with a clustering ensemble, for datasets with a large number of features. Six real-world datasets from the image and biological domains are evaluated, and an average accuracy between 75.18% and 98.04% is achieved. Results show that the proposed technique outperforms state-of-the-art methods in terms of both effectiveness and efficiency.
Keywords: clustering ensemble; consensus; features selection; weighted partitions; multi-features data; large datasets; external index; data fusion; graph partitioning.
DOI: 10.1504/IJIEI.2019.101549
International Journal of Intelligent Engineering Informatics, 2019 Vol.7 No.4, pp.323 - 345
Received: 26 Sep 2018
Accepted: 02 Jan 2019
Published online: 12 Aug 2019