Title: Effective feature selection based on MANOVA
Authors: Trong-Kha Nguyen; Vu Duc Ly; Seong Oun Hwang
Addresses: Department of Electronics and Computer Engineering, Hongik University, Sejong, South Korea; Department of Electronics and Computer Engineering, Hongik University, Sejong, South Korea; Department of Software and Communications Engineering, Hongik University, Sejong, South Korea
Abstract: Effectiveness in classifying malware is a critical issue, since poor effectiveness can overload a classifier or reduce performance in real-time malware detection systems. However, effectiveness at the feature selection stage has not been studied so far. Because effectiveness should be taken into account at the earliest possible stage, this paper focuses on the effectiveness of feature selection. First, we perform an instruction-level analysis based on the most frequent mnemonics. Second, we propose new methods that select effective features using MANOVA statistical tests. Finally, we feed the selected features to a classifier. Our approach significantly reduces the number of features from 390 to 4, which explain 99.4% of the variation in the data. With the selected features, we classify malware samples and achieve 96.2% accuracy with a false positive rate of 0.6%.
Keywords: malware classification; statistical analysis; security.
DOI: 10.1504/IJITST.2020.108133
International Journal of Internet Technology and Secured Transactions, 2020 Vol. 10 No. 4, pp. 383-395
Received: 07 Apr 2018
Accepted: 17 Nov 2018
Published online: 03 Jul 2020