Malicious URL detection with feature extraction based on machine learning Online publication date: Fri, 31-Aug-2018
by Baojiang Cui; Shanshan He; Xi Yao; Peilin Shi
International Journal of High Performance Computing and Networking (IJHPCN), Vol. 12, No. 2, 2018
Abstract: Many web applications suffer from various web attacks due to the lack of awareness concerning security. Therefore, it is necessary to improve the reliability of web applications by accurately detecting malicious URLs. In previous studies, keyword matching has always been used to detect malicious URLs, but this method is not adaptive. In this paper, statistical analyses based on gradient learning and feature extraction using a sigmoidal threshold level are combined to propose a new detection approach based on machine learning techniques. Moreover, the naïve Bayes, decision tree and SVM classifiers are used to validate the accuracy and efficiency of this method. Finally, the experimental results demonstrate that this method has a good detection performance, with an accuracy rate above 98.7%. In practical use, this system has been deployed online and is being used in large-scale detection, analysing approximately 2 TB of data every day.
Existing subscribers:
Go to Inderscience Online Journals to access the Full Text of this article.
If you are not a subscriber and you just want to read the full contents of this article, buy online access here.Complimentary Subscribers, Editors or Members of the Editorial Board of the International Journal of High Performance Computing and Networking (IJHPCN):
Login with your Inderscience username and password:
Want to subscribe?
A subscription gives you complete access to all articles in the current issue, as well as to all articles in the previous three years (where applicable). See our Orders page to subscribe.
If you still need assistance, please email subs@inderscience.com