Open Access Article

Title: Data Mining Methods in the Detection of Spam

Authors: Dong-Her Shih; Hsiu-Sen Chiang; Chia-Shyang Lin; Ming-Hung Shih

Addresses: Author address listing can be found in the "About the Authors" section at the end of the article.

Abstract: The spam problem has generated enormous costs for companies and Internet users. Users pay not only for the bandwidth that delivers large volumes of spam mail but also for its storage. In this paper, we propose a modified Naïve Bayesian classifier and compare it with three data mining methods for automatically identifying whether incoming mail is spam or legitimate. The experimental results show that although no single algorithm dominates on the spam problem, the decision tree generally performs better than the other methods. Our proposed modified Naïve Bayesian classifier also shows potential that merits further investigation.

Keywords: Spam detection; data mining; Naïve Bayesian classifier; decision tree.

DOI: 10.1504/JBM.2008.141163

Journal of Business and Management, 2008 Vol.14 No.2, pp.117 - 129

Published online: 05 Sep 2024