Title: Optimisation of plagiarism detection using vector space model on CUDA architecture
Authors: Jiffriya Mohamed Abdul Cader; Akmal Jahan Mohamed Abdul Cader; Hasindu Gamaarachchi; Roshan G. Ragel
Addresses: Department of Information Technology, Sri Lanka Institute of Advanced Technological Education Sammanthurai, Sri Lanka; Faculty of Applied Sciences, South Eastern University of Sri Lanka, Sri Lanka; School of Computer Science and Engineering, University of New South Wales, Australia; Faculty of Engineering, University of Peradeniya, Sri Lanka
Abstract: Plagiarism is a rapidly growing problem among students submitting assignments, reports and publications at universities and educational institutions, owing to the easy accessibility of abundant e-resources on the internet. Existing tools become inefficient in terms of time consumption when dealing with a large number of documents with large content. Therefore, we have focused on software-based acceleration of plagiarism detection using the CPU and GPU. Initially, a serial version of the vector space model was implemented on the CPU and tested with 1,000 documents, which took 1,641 s. As processing time was the performance bottleneck, we developed a parallel version of the model on graphics processing units (GPUs) using compute unified device architecture (CUDA) and tested it with the same dataset; it took only 36 s, a 45x speed-up over the CPU. This version was then optimised further and took only 4 s for the same dataset, 389x faster than the serial implementation.
Keywords: graphics processing units; GPUs; compute unified device architecture; CUDA; plagiarism detection; vector space model; CPU; VSM; parallel computing; speed up; acceleration; idf; web-based commercial tool; kernel; Google Cloud.
DOI: 10.1504/IJICA.2022.125675
International Journal of Innovative Computing and Applications, 2022 Vol.13 No.4, pp.232 - 244
Received: 23 Nov 2020
Accepted: 16 Mar 2021
Published online: 26 Sep 2022
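
Illustrative sketch: The abstract names the vector space model and lists idf among its keywords but does not give the formulation used. The following is a minimal serial sketch, assuming TF-IDF term weighting, cosine similarity and an all-pairs comparison; these choices, the whitespace tokeniser and every function name here are assumptions made for illustration, not the authors' implementation. Under these assumptions each pair of documents is scored independently of every other pair, which is the kind of workload that can be distributed across GPU threads.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per tokenised document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Assumed weighting: raw term frequency times log(N / df).
    idf = {term: math.log(n / count) for term, count in df.items()}
    return [
        {term: tf * idf[term] for term, tf in Counter(doc).items()}
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pairwise_similarity(texts):
    """Score every unordered pair of documents (hypothetical helper)."""
    docs = [text.lower().split() for text in texts]  # naive tokeniser
    vecs = tfidf_vectors(docs)
    return {
        (i, j): cosine(vecs[i], vecs[j])
        for i, j in combinations(range(len(vecs)), 2)
    }
```

If all pairs are compared, as assumed here, 1,000 documents give 1,000 × 999 / 2 = 499,500 pair scores, so the serial cost grows quadratically with the number of documents.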