Title: Highly imbalanced classification using improved rotation forests
Authors: Xiaonan Fang; Xiyuan Zheng; Yanyan Tan; Huaxiang Zhang
Addresses: Department of Information Science and Engineering, Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Shandong Normal University, Jinan, China (same affiliation for all four authors)
Abstract: Imbalanced data classification is a challenging problem in data mining. It arises in many real-world applications and has attracted growing attention from researchers. The problem occurs when the number of instances in one class is much higher than in the other. Ensembles of classifiers are well known as an effective solution. In this paper, two novel ensemble algorithms (RUROForest and SROForest) based on rotation forests are proposed for solving highly imbalanced problems. The proposed algorithms combine random under-sampling or SMOTE with rotation forest, which balances the uneven class distribution of the data sets while preserving the diversity of the individual classifiers. Focusing on two-class highly imbalanced problems, experiments are performed on 22 relevant data sets. Experimental results and statistical analyses show that the proposed methods outperform state-of-the-art ensemble methods on AUC, the most widely used evaluation measure for imbalanced classification.
Keywords: ensemble learning; imbalanced data sets; rotation forests; SMOTE; random under-sampling; imbalanced classification; data mining.
DOI: 10.1504/IJWMC.2016.075233
International Journal of Wireless and Mobile Computing, 2016 Vol.10 No.1, pp.35 - 41
Received: 23 Jul 2015
Accepted: 01 Sep 2015
Published online: 08 Mar 2016
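
The abstract names random under-sampling as one of the two balancing steps combined with rotation forest. The following is a minimal sketch of that balancing step only, not the authors' RUROForest implementation, whose details the abstract does not give. The function name `random_under_sample`, the `seed` parameter, and the toy data are hypothetical and used purely for illustration.

```python
import random
from collections import Counter


def random_under_sample(X, y, seed=0):
    """Balance a labelled data set by randomly discarding majority-class rows.

    Hypothetical helper for illustration: every class is reduced to the
    size of the smallest class, sampling without replacement.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    n_min = min(counts.values())  # size of the minority class

    keep = []
    for label in counts:
        idx = [i for i, lab in enumerate(y) if lab == label]
        # Sample without replacement so that no row is duplicated.
        keep.extend(rng.sample(idx, n_min))

    keep.sort()  # preserve the original row order
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy two-class data: 95 majority rows (label 0) and 5 minority rows (label 1).
X = [[float(i), float(i % 7)] for i in range(100)]
y = [0] * 95 + [1] * 5

X_bal, y_bal = random_under_sample(X, y, seed=42)
print(Counter(y_bal))
```

In an ensemble setting, one plausible design (an assumption here, not a claim about the paper) is to draw a different under-sample for each base classifier by varying `seed`, so that each member sees a balanced but distinct subset of the majority class.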