Title: Detecting data inconsistencies by multiple target rules
Authors: Kalaivany Natarajan; Jiuyong Li; Andy Koronios
Addresses: School of Computer and Information Science, University of South Australia, Mawson Lakes, South Australia 5095, Australia (all three authors)
Abstract: Data quality problems are common in large databases, and data inconsistency is one of the main ones. Data mining techniques can be used to predict inconsistent values, and association rule mining is one of the main such techniques. Association rules capture relationships between attribute values and can be used to identify inconsistent values. In this paper, we use multiple target rules to identify inconsistent values. Multiple target rules extend association rules by using a set of disjunctive attribute values as the consequent. Traditional association rules predict inconsistent values with a single or multiple conjunctive right-hand side (RHS), and their coverage is limited by the high confidence requirement. We propose extending the RHS to multiple disjunctive values. This extends the coverage of the rules, and multiple disjunctive rules have higher predictive power than traditional association rules.
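The contrast between a single-value consequent and a disjunctive one can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not spell out; the toy records, the `confidence` and `flag_inconsistent` helpers, and the choice of attribute values are all assumptions made for illustration. It shows only why a rule whose RHS is a set of alternative values can reach a confidence level that no single-value rule reaches, and how records falling outside that set might then be flagged.

```python
# Hypothetical toy data: (city, state) pairs. Not taken from the paper.
records = [
    ("Springfield", "IL"),
    ("Springfield", "IL"),
    ("Springfield", "IL"),
    ("Springfield", "MA"),
    ("Springfield", "MA"),
    ("Springfield", "MA"),
    ("Springfield", "IL"),
    ("Springfield", "MA"),
    ("Springfield", "MO"),
    ("Springfield", "XX"),
]


def confidence(records, lhs_value, rhs_values):
    """Fraction of records matching the LHS whose target is in rhs_values.

    With a one-element rhs_values this is the usual single-target rule
    confidence; with several elements it treats the RHS as a disjunction.
    """
    targets = [target for lhs, target in records if lhs == lhs_value]
    if not targets:
        return 0.0
    hits = sum(1 for target in targets if target in rhs_values)
    return hits / len(targets)


def flag_inconsistent(records, lhs_value, rhs_values):
    """Indices of records that match the LHS but fall outside the RHS set."""
    return [
        i
        for i, (lhs, target) in enumerate(records)
        if lhs == lhs_value and target not in rhs_values
    ]


# Single-target rule: city = Springfield -> state = IL
single_conf = confidence(records, "Springfield", {"IL"})

# Disjunctive rule: city = Springfield -> state = IL OR state = MA
multi_conf = confidence(records, "Springfield", {"IL", "MA"})

# Records the disjunctive rule would mark as candidates for inconsistency
suspects = flag_inconsistent(records, "Springfield", {"IL", "MA"})

print(single_conf, multi_conf, suspects)
```

In this toy data neither single-value rule would pass a high confidence threshold, so it would cover none of the records, whereas the disjunctive rule covers them and isolates the two outlying values as suspects.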
Keywords: data cleaning; data mining; association rules; multiple target rules; data inconsistency; data quality; rule mining.
DOI: 10.1504/IJBSR.2012.047928
International Journal of Business and Systems Research, 2012 Vol.6 No.3, pp.296 - 312
Published online: 14 Nov 2014