Title: Memory-efficient detection of large-scale obfuscated malware
Authors: Yueming Wang; Meng Zhang
Addresses: College of Computer Science and Technology, Jilin University, Changchun, Jilin, China; College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
Abstract: Obfuscation techniques are frequently used in malicious programs to evade detection. However, existing effective detection methods often require a large amount of memory during training. This paper proposes a machine-learning-based solution to the malware detection problem that consumes less memory. We use hashing and a sparse matrix to build a bag-of-words representation of the text, which reduces memory usage during training. Experiments show that, compared with the existing model, our approach reduces the memory footprint by 95% when training obfuscation recognition on 110,000 text samples. In the de-obfuscation step, our method improves the recognition accuracy of import table functions by 40%. Overall, our model achieves low memory usage during obfuscation recognition training and improves the accuracy of import table recognition, while its obfuscation recognition accuracy is only about 10% lower than that of the model before the improvement.
Keywords: malware; Naïve Bayes; algorithm.
DOI: 10.1504/IJWMC.2024.136586
International Journal of Wireless and Mobile Computing, 2024 Vol.26 No.1, pp.48 - 60
Received: 08 Nov 2022
Accepted: 22 Feb 2023
Published online: 07 Feb 2024
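
The abstract's idea of combining hashing with a sparse matrix to build a bag of words can be illustrated with a minimal sketch. This is not the authors' implementation: the bucket count, the use of CRC32 as the hash function, whitespace tokenisation, and the opcode-like sample strings are all assumptions made for illustration. The point it shows is that hashing removes the need to keep a vocabulary dictionary in memory, and a compressed sparse row (CSR) layout stores only the non-zero counts.

```python
import zlib
from collections import Counter

N_BUCKETS = 2 ** 18  # assumed feature-space size; not taken from the paper


def hash_token(token: str, n_buckets: int = N_BUCKETS) -> int:
    """Map a token to a fixed column index using a stable hash (CRC32)."""
    return zlib.crc32(token.encode("utf-8")) % n_buckets


def hashed_bow_csr(docs, n_buckets: int = N_BUCKETS):
    """Build a hashed bag-of-words matrix in CSR form.

    Returns (data, indices, indptr). Row i occupies the slice
    indptr[i]:indptr[i + 1] of data (counts) and indices (columns).
    Only non-zero counts are stored and no vocabulary is kept.
    """
    data, indices, indptr = [], [], [0]
    for doc in docs:
        # Count tokens per hashed column; colliding tokens share a column.
        counts = Counter(hash_token(tok, n_buckets) for tok in doc.split())
        for col in sorted(counts):
            indices.append(col)
            data.append(counts[col])
        indptr.append(len(indices))
    return data, indices, indptr


# Hypothetical samples: whitespace-separated opcode-like tokens.
docs = ["push mov call mov", "xor jmp xor xor"]
data, indices, indptr = hashed_bow_csr(docs)
```

With this layout, memory grows with the number of non-zero entries rather than with the number of samples times the feature-space size. The three arrays match the layout used by common sparse-matrix libraries, so a classifier such as Naïve Bayes (named in the keywords) could be trained on them; whether the paper does so in this way is not stated in the abstract.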