Title: Memory-efficient detection of large-scale obfuscated malware

Authors: Yueming Wang; Meng Zhang

Addresses: College of Computer Science and Technology, Jilin University, Changchun, Jilin, China; College of Computer Science and Technology, Jilin University, Changchun, Jilin, China

Abstract: Obfuscation techniques are frequently used in malicious programs to evade detection. However, current effective detection methods often require a large amount of memory during training. This paper proposes a machine-learning-based solution to the malware detection problem that consumes less memory. We use hashing and a sparse matrix to build a bag-of-words representation of the text, which reduces memory usage during training. Experiments show that, compared to the existing model, our approach reduces the memory footprint by 95% when training obfuscation recognition on 110,000 text samples. In the de-obfuscation step, our method improves the recognition accuracy of import table functions by 40%. Overall, our model achieves very low memory usage during obfuscation recognition training and improves the accuracy of import table recognition. This comes at a modest cost: obfuscation recognition accuracy is only about 10% lower than that of the model before the improvement.
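
The idea of combining hashing with a sparse matrix to build a bag of words can be illustrated with a minimal sketch. This is not the authors' implementation: the hash function (CRC32), the number of hash buckets, the whitespace tokenisation, the sample strings and the names `hash_token` and `hashed_bag_of_words` are all assumptions chosen for illustration. The sketch shows why memory stays low: no vocabulary dictionary is kept, and only non-zero counts are stored in a compressed sparse row (CSR) layout.

```python
import zlib
from collections import Counter

N_FEATURES = 2**18  # assumed number of hash buckets (columns)

def hash_token(token, n_features=N_FEATURES):
    """Map a token to a column index with a stable hash (CRC32, an assumption)."""
    return zlib.crc32(token.encode("utf-8")) % n_features

def hashed_bag_of_words(docs, n_features=N_FEATURES):
    """Build a CSR-style sparse count matrix without storing a vocabulary.

    Returns (data, indices, indptr): row i owns the slice
    data[indptr[i]:indptr[i + 1]] with matching column ids in indices.
    """
    data, indices, indptr = [], [], [0]
    for doc in docs:
        # Count how often each hashed column occurs in this document.
        counts = Counter(hash_token(tok, n_features) for tok in doc.split())
        for col in sorted(counts):
            indices.append(col)
            data.append(counts[col])
        indptr.append(len(indices))
    return data, indices, indptr

# Hypothetical token sequences standing in for disassembled text samples.
docs = [
    "push ebp mov ebp esp",
    "xor eax eax xor ebx ebx",
    "mov eax ebx",
]
data, indices, indptr = hashed_bag_of_words(docs)

stored_cells = len(data)                # non-zero entries actually kept
dense_cells = len(docs) * N_FEATURES    # cells a dense matrix would need
```

In this layout the storage cost grows with the number of distinct tokens per sample rather than with the width of the feature space, so `stored_cells` stays tiny next to `dense_cells`. The resulting counts could then be fed to a classifier such as Naïve Bayes, which the keywords mention.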

Keywords: malware; Naïve Bayes; algorithm.

DOI: 10.1504/IJWMC.2024.136586

International Journal of Wireless and Mobile Computing, 2024 Vol.26 No.1, pp.48 - 60

Received: 08 Nov 2022
Accepted: 22 Feb 2023

Published online: 07 Feb 2024
