Title: A hybrid algorithm for mining local outliers in categorical data
Authors: Meiling Liu; Mingxuan Huang; Weidong Tang
Addresses: College of Software and Information Security, Guangxi University for Nationalities, Nanning 530006, China; Science Computing and Intelligent Information Processing of Guangxi Higher Education Key Laboratory, Nanning 530023, China ' College of Information and Statistics, Guangxi University of Finance and Economics, Nanning 530003, China ' College of Information Science and Engineering, Guangxi University for Nationalities, Nanning 530006, China
Abstract: Outlier detection is an important task in data mining, and many approaches have been developed for it. However, most research focuses on global outlier detection, although in many situations local outlier detection is more valuable. In this paper, existing methods for outlier detection are first discussed, and then the definition of a local outlier and the associated formulas are given. A hybrid algorithm for mining local outliers is then proposed, based on a clustering algorithm and the statistical standard deviation. By calculating the standard deviation of each cluster and the local outlier factor of each object in the cluster, clusters with a higher standard deviation are identified as likely to contain outliers, and objects with a higher local outlier factor are recognised as outliers. Experimental results on real datasets show that the proposed algorithm is correct and effective for mining local outliers.
Keywords: local outlier; standard deviation; local outlier factor; clustering; data mining.
DOI: 10.1504/IJWMC.2017.087342
International Journal of Wireless and Mobile Computing, 2017 Vol.13 No.1, pp.78 - 85
Received: 05 Aug 2016
Accepted: 16 Feb 2017
Published online: 13 Oct 2017
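
Illustrative sketch: the abstract describes a two-level idea (a per-cluster standard deviation to flag suspicious clusters, then a per-object local outlier factor inside each cluster) but does not reproduce the paper's formulas. The Python below is only a minimal sketch of that general idea under stated assumptions, not the authors' algorithm. It assumes the clusters are already given, uses the Hamming distance to the cluster's attribute-wise mode as the dissimilarity for categorical data, and uses a hypothetical score (distance divided by the cluster's mean distance) in place of the paper's local outlier factor. The names `hamming`, `cluster_mode` and `score_clusters` are made up for this example.

```python
from collections import Counter
from statistics import mean, pstdev


def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_mode(rows):
    """Most frequent value of each attribute within a cluster."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def score_clusters(clusters):
    """clusters: dict mapping a cluster label to a list of categorical tuples.

    Returns, per cluster, the standard deviation of the distances to the
    cluster mode and a per-object score. Both definitions are assumptions
    made for this sketch, not the formulas of the paper.
    """
    report = {}
    for label, rows in clusters.items():
        mode = cluster_mode(rows)
        dists = [hamming(r, mode) for r in rows]
        avg = mean(dists)
        # Hypothetical local score: distance relative to the cluster average.
        scores = [d / avg if avg > 0 else 0.0 for d in dists]
        report[label] = {"std": pstdev(dists), "scores": scores}
    return report


# Toy data, invented for illustration: cluster "A" holds one odd record.
clusters = {
    "A": [("red", "small", "round")] * 4 + [("blue", "large", "square")],
    "B": [("green", "medium", "oval")] * 5,
}
report = score_clusters(clusters)

# Clusters with a larger spread are inspected first; within them, the
# objects with the largest score are the outlier candidates.
suspect = max(report, key=lambda label: report[label]["std"])
candidate = max(range(len(clusters[suspect])),
                key=lambda i: report[suspect]["scores"][i])
```

In this toy input the homogeneous cluster has zero spread, so only the mixed cluster is examined, and the record that differs from the cluster mode on every attribute receives the highest score. Any real use would replace the given clusters with the output of a categorical clustering step and the score with the local outlier factor defined in the paper.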