Title: A content independent domain abuse detection method
Authors: Yang Fan; Xiang Zhengrong; Tang Shoulian
Addresses: School of Economics and Management, Beijing University of Posts and Telecommunications, Beijing, China (all authors)
Abstract: This paper proposes a set of language-independent features for detecting domain name abuse, covering domain name string features, domain name registration features, domain name resolution features and domain name service features, and trains six pattern recognition algorithms in the corresponding feature space. To validate the effectiveness of the extracted features and learning algorithms, a practical data set is constructed, and the performance of the features and learning algorithms is compared and analysed. The experimental results show that the multi-scale features extracted in this paper have good recognition ability. The Random Forest algorithm achieves the best overall performance using only 8-dimensional fusion features, with F1-Measure and ROC Area reaching 0.965 and 0.978, respectively.
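As a rough illustration of what language-independent domain name string features might look like, the following Python sketch computes a few simple statistics from the name alone. The function name `string_features` and the specific features (length, label count, digit ratio, hyphen count, character entropy) are assumptions chosen for illustration; the abstract does not enumerate the features actually used in the paper.

```python
import math
from collections import Counter


def string_features(domain: str) -> dict:
    """Compute illustrative string features of a domain name.

    Hypothetical feature set: these are common choices in the
    literature, not necessarily the ones used in the paper.
    """
    name = domain.lower().rstrip(".")   # normalise case, drop trailing root dot
    labels = name.split(".")            # e.g. ["example", "com"]
    chars = name.replace(".", "")       # characters excluding separators
    total = len(chars)

    # Shannon entropy (bits per character) of the character distribution.
    counts = Counter(chars)
    entropy = 0.0
    if total:
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())

    return {
        "length": len(name),
        "num_labels": len(labels),
        "digit_ratio": sum(ch.isdigit() for ch in chars) / total if total else 0.0,
        "hyphen_count": chars.count("-"),
        "char_entropy": entropy,
    }
```

None of these statistics depends on the language or content of the pages hosted under the domain. A feature vector of this kind could then be combined with registration, resolution and service features before being passed to a classifier such as a random forest.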
Keywords: domain name system; domain abuse detection; machine learning; feature extraction.
DOI: 10.1504/IJWMC.2020.105699
International Journal of Wireless and Mobile Computing, 2020 Vol.18 No.2, pp.123 - 131
Received: 01 Jun 2019
Accepted: 07 Aug 2019
Published online: 09 Mar 2020