Title: Credit card fraud detection: an evaluation of SMOTE resampling and machine learning model performance

Authors: Faleh Alshameri; Ran Xia

Addresses: Department of Information Technology, Data Science, and Cybersecurity, School of Business and Technology, Marymount University, USA; Department of Information Technology, Data Science, and Cybersecurity, School of Business and Technology, Marymount University, USA

Abstract: Credit card fraud is a well-known security issue that requires financial organisations to continuously improve their fraud detection systems. In most cases, a credit card transaction dataset is expected to contain far more normal transactions than fraudulent ones. The accuracy of a fraud detection system therefore depends on building a model that can adequately handle such an imbalanced dataset. The purpose of this paper is to explore one dataset rebalancing technique, the synthetic minority oversampling technique (SMOTE). To evaluate the effects of this technique on model training, we selected four basic classification algorithms: complement naïve Bayes (CNB), K-nearest neighbour (KNN), random forest and support vector machine (SVM). We then compared the performances of the four models trained on the rebalanced and original datasets using area under the precision-recall curve (AUPRC) plots.
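
The core idea behind SMOTE can be illustrated with a minimal sketch: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its k nearest minority neighbours. The code below is an illustration only, not the authors' implementation; the function name `smote_sketch`, its parameters and the toy data are all hypothetical. In practice a library implementation (for example the `SMOTE` class in imbalanced-learn) would normally be used instead.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data (made up): 3 fraud rows versus 9 normal rows, two features each.
fraud = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3)]
normal = [(float(i), float(-i)) for i in range(9)]

# Oversample the fraud class until both classes have the same size.
new_points = smote_sketch(fraud, n_new=len(normal) - len(fraud), k=2)
balanced_fraud = fraud + new_points
```

Because every synthetic point is an interpolation between two real fraud rows, the new points stay inside the region already occupied by the minority class rather than duplicating existing rows. Oversampling of this kind is normally applied to the training split only, so that evaluation metrics such as AUPRC are computed on untouched data.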

Keywords: credit card; imbalanced dataset; resampling method; synthetic minority oversampling technique; SMOTE; AUPRC; classification algorithms.

DOI: 10.1504/IJBIDM.2023.131791

International Journal of Business Intelligence and Data Mining, 2023 Vol.23 No.1, pp.1 - 13

Received: 12 Aug 2021
Accepted: 30 Nov 2021

Published online: 03 Jul 2023
