Detection of phishing websites using data mining tools and techniques
Online publication date: Mon, 23-May-2022
by Mansi Somani; Mamatha Balachandra
International Journal of Advanced Intelligence Paradigms (IJAIP), Vol. 22, No. 1/2, 2022
Abstract: Phishing, a prevalent cyber-security issue, is one of the most common attacks used to obtain users' sensitive information. To eradicate it, users or software must first detect it. A popular way to carry out phishing is by generating phishing URLs. A URL is either legitimate or phishy, which makes phishing detection a natural classification problem in data mining. Hence, five data mining algorithms, namely C4.5 (J48), SVM, Random Forest, Treebag and GBM, have been trained and compared on accuracy, recall and precision to determine the most suitable model. Rules have been listed that categorise the features which make a website phishy or legitimate. The work has been done in the R language using RStudio. The dataset used comprises 11,055 tuples and 31 attributes. It is used for training, testing and detection. Among the five classifiers used, the best accuracy, 97.21%, is obtained with the Random Forest model.
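The three comparison measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' R code: it is a plain Python illustration, the function name `classification_metrics` is hypothetical, the encoding of 1 for phishy and 0 for legitimate is an assumption, and the toy labels are made up rather than taken from the paper's dataset.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Compute accuracy, precision and recall for a binary classifier.

    `positive` is the label treated as the phishy class (assumed to be 1).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive (phishy) class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Toy example (made-up labels): 1 = phishy, 0 = legitimate.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(classification_metrics(toy_true, toy_pred))
```

Precision penalises legitimate sites flagged as phishy, while recall penalises phishy sites that are missed, so a comparison of classifiers on all three measures says more than accuracy alone.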