Title: Distributed computing and shared memory-based utility list buffer miner with parallel frameworks for high utility itemset mining
Authors: Eduardus Hardika Sandy Atmaja; Kavita Sonawane
Addresses: Department of Computer Engineering, St. Francis Institute of Technology, Mumbai, India; Department of Informatics, Sanata Dharma University, Yogyakarta, Indonesia ' Department of Computer Engineering, St. Francis Institute of Technology, Mumbai, India
Abstract: High utility itemset mining (HUIM) is a well-known pattern mining technique. It takes the utility of items into account, which leads to high-profit patterns that are more useful in real-world conditions. Handling large and complex datasets is the major challenge in HUIM, and the main problem is exponential time complexity. The literature shows multi-core approaches that address this problem by parallelising the tasks, but they are limited to the resources of a single machine and call for a novel strategy. To address this problem, we propose new strategies, namely distributed computing (DC-PLB) and shared memory (SM-PLB)-based utility list buffer miner with parallel frameworks (PLB). They utilise cluster nodes to parallelise and distribute the tasks efficiently. Thorough experiments showed that the proposed frameworks achieved a better runtime (448 s) on dense datasets than the existing PLB (2,237 s). They effectively address the challenges of handling large and complex datasets.
Keywords: high utility itemset mining; HUIM; PLB; DC-PLB; SM-PLB; cluster computing; parallel and distributed computing; data mining; MPI; Apache Spark.
DOI: 10.1504/IJBIDM.2023.132580
International Journal of Business Intelligence and Data Mining, 2023 Vol.23 No.2, pp.125 - 149
Received: 04 Nov 2021
Accepted: 27 Feb 2022
Published online: 30 Jul 2023
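
Illustration: the abstract refers to the utility of items and to exponential time complexity without showing either. The following is a minimal Python sketch of the general HUIM notion only, not of PLB, DC-PLB or SM-PLB. The toy transactions, unit profits, threshold and function names are all hypothetical and chosen for illustration. It computes an itemset's utility as quantity times unit profit, summed over the transactions that contain the whole itemset, and then enumerates every candidate itemset by brute force, which is the exponential search that motivates parallel and distributed miners.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 2, "d": 1},
]
# Hypothetical unit profit per item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 10}


def itemset_utility(itemset, database, profit):
    """Sum quantity * unit profit over transactions containing the whole itemset."""
    total = 0
    for tx in database:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * profit[item] for item in itemset)
    return total


def high_utility_itemsets(database, profit, min_util):
    """Brute-force enumeration of all itemsets; exponential in the number of items."""
    items = sorted({item for tx in database for item in tx})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, database, profit)
            if utility >= min_util:
                result[itemset] = utility
    return result


# Assumed threshold, picked only so that the toy data yields a small result.
huis = high_utility_itemsets(transactions, unit_profit, min_util=15)
print(huis)  # {('a',): 15, ('a', 'c'): 19}
```

In this toy data the itemset (a, c) has a higher utility than (a) alone, which reflects the fact that utility is not anti-monotone, so the support-based pruning used in frequent itemset mining does not carry over directly.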