Title: Decision trees for binary classification variables grow equally with the Gini impurity measure and Pearson's chi-square test
Authors: Johannes L. Grabmeier, Larry A. Lambe
Addresses: University of Applied Sciences Deggendorf, Edlmairstr. 6+8, D-94469, Deggendorf, Germany; Multidisciplinary Software Systems Research Corporation (MSSRC), P.O. Box 6667, Bloomingdale, IL 60108, USA
Abstract: We show that for binary classification variables, Gini and Pearson purity measures yield exactly the same tree, provided all the other parameters of the algorithms are identical. A counter-example for ternary classification variables is given.
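The binary-class claim can be checked numerically at a single node. The sketch below is an illustration only, not the paper's proof: it assumes the two-class Gini impurity 2p(1-p) and the standard Pearson chi-square statistic on the child-by-class contingency table. The function names and the candidate splits are hypothetical. For a fixed parent node with n cases, the chi-square statistic equals n times the Gini gain divided by the parent's Gini impurity, so both criteria rank the candidate splits identically.

```python
def gini(p):
    """Gini impurity of a two-class node whose class-1 fraction is p."""
    return 2.0 * p * (1.0 - p)


def gini_gain(table):
    """Impurity decrease of a split.

    table: one (n_class0, n_class1) pair per non-empty child node.
    """
    n = sum(a + b for a, b in table)
    p_parent = sum(b for _, b in table) / n
    weighted = sum((a + b) / n * gini(b / (a + b)) for a, b in table)
    return gini(p_parent) - weighted


def pearson_chi2(table):
    """Pearson chi-square statistic of the child-by-class contingency table."""
    n = sum(a + b for a, b in table)
    class_totals = [sum(a for a, _ in table), sum(b for _, b in table)]
    stat = 0.0
    for row in table:
        row_total = sum(row)
        for k in (0, 1):
            expected = row_total * class_totals[k] / n
            stat += (row[k] - expected) ** 2 / expected
    return stat


# Hypothetical candidate splits of one parent node (40 class-0, 60 class-1 cases).
splits = {
    "A": [(30, 20), (10, 40)],
    "B": [(25, 35), (15, 25)],
    "C": [(35, 5), (5, 55)],
}

rank_by_gini = sorted(splits, key=lambda s: gini_gain(splits[s]), reverse=True)
rank_by_chi2 = sorted(splits, key=lambda s: pearson_chi2(splits[s]), reverse=True)
print(rank_by_gini == rank_by_chi2)
```

The proportionality depends on there being exactly two classes. With three or more classes the expected-count denominators differ per class, which is consistent with the ternary counter-example mentioned in the abstract.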
Keywords: decision trees; Gini; impurity measure; Pearson; chi-square test; entropy; binary classification; variables; contingency matrix; power series expansion; symmetric polynomials; purity measures; ternary classification; data mining.
DOI: 10.1504/IJBIDM.2007.013938
International Journal of Business Intelligence and Data Mining, 2007 Vol.2 No.2, pp.213 - 226
Published online: 04 Jun 2007