Title: A fine-grained scheduling strategy for improving the performance of parallel frequent itemsets mining
Authors: Chao-Chin Wu; Lien-Fu Lai; Liang-Tsung Huang; Syun-Sheng Jhan; Chung Lu
Addresses: Department of Computer Science and Information Engineering, National Changhua University of Education, 2 Shi Da Road, Changhua City 500, Taiwan; Department of Computer Science and Information Engineering, National Changhua University of Education, 2 Shi Da Road, Changhua City 500, Taiwan; Department of Biotechnology, MingDao University, 69 Wen-Hua Road, Peetow, Changhua 523, Taiwan; Department of Information Technology, Ling Tung University, 1 Ling Tung Road, Taichung City 409, Taiwan; Department of Computer Science and Information Engineering, National Changhua University of Education, 2 Shi Da Road, Changhua City 500, Taiwan
Abstract: We propose a scheduling strategy to address the load imbalance problem of the recently published distributed parallel apriori (DPA) algorithm. We derive fine-grained tasks by dividing the tasks defined by DPA into smaller subtasks, which are then scheduled by a dynamic self-scheduling scheme for better load balance. Furthermore, we propose two methods for transmitting data from the master to the workers. The first broadcasts all the frequent k-itemsets to all worker nodes, while the second transmits only the required data to each individual worker node. Experimental results demonstrate that both proposed approaches outperform DPA. The first is more suitable for small datasets, and the second provides steadier performance improvement regardless of which self-scheduling scheme is adopted.
Keywords: data mining; frequent itemsets; parallel computing; distributed computing; cluster systems; dynamic scheduling; load imbalance; fine-grained scheduling; self-scheduling.
DOI: 10.1504/IJCSE.2011.043925
International Journal of Computational Science and Engineering, 2011 Vol. 6 No. 4, pp. 264-274
Received: 29 Apr 2011
Accepted: 15 Jul 2011
Published online: 21 Mar 2015
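
The load-balancing idea summarised in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the equal-cost split of each task, the round-robin static baseline, and the example costs are all hypothetical, and communication cost between master and workers is ignored entirely.

```python
import heapq


def static_makespan(task_costs, n_workers):
    """Assign whole tasks round-robin up front; return the slowest worker's finish time."""
    loads = [0.0] * n_workers
    for i, cost in enumerate(task_costs):
        loads[i % n_workers] += cost
    return max(loads)


def split_into_subtasks(task_costs, pieces):
    """Divide each task into `pieces` equal-cost subtasks (an idealised split)."""
    return [cost / pieces for cost in task_costs for _ in range(pieces)]


def self_scheduling_makespan(subtask_costs, n_workers, chunk=1):
    """Simulate dynamic self-scheduling.

    Whichever worker becomes idle first takes the next `chunk` subtasks
    from the shared list, until no subtasks remain.
    """
    # Min-heap of (time at which the worker becomes idle, worker id).
    idle_at = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(idle_at)
    for start in range(0, len(subtask_costs), chunk):
        t, w = heapq.heappop(idle_at)
        t += sum(subtask_costs[start:start + chunk])
        heapq.heappush(idle_at, (t, w))
    return max(t for t, _ in idle_at)


# Hypothetical task costs: one task is much heavier than the rest.
task_costs = [8, 1, 1, 1, 1, 1, 1, 2]
coarse = static_makespan(task_costs, n_workers=2)
fine = self_scheduling_makespan(split_into_subtasks(task_costs, 4), n_workers=2)
print(coarse, fine)  # 11.0 8.0
```

In this toy case the static assignment leaves one worker with 11 units of work out of 16, whereas handing out fine-grained subtasks on demand reaches the ideal 8 units per worker. A larger `chunk` value reduces the number of requests to the master at the price of coarser balancing, which is the trade-off that different self-scheduling schemes explore.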