Inducing and evaluating classification trees with statistical implicative criteria
Abstract
Statistical implicative criteria have proven to be valuable interestingness measures for association rules. Here we highlight their relevance for classification trees. We start by showing how Gras' implication index can be defined for rules derived from an induced decision tree. This index is especially helpful when the aim is not classification itself but the characterization of the most typical conditions of a given conclusion. We show that the index has the form of a standardized residual and propose, as alternatives, other forms of residuals borrowed from the modeling of contingency tables. We then consider two main uses of these indexes. The first is purely descriptive and concerns the a posteriori evaluation of individual classification rules. The second relies on the strength of implication to assign the most appropriate conclusion to each leaf of the induced tree. We demonstrate the practical usefulness of this statistical implicative view of decision trees through a full-scale real-world application.
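As a rough illustration of the two uses described above, the following sketch computes an implication index for each leaf of a tree and compares the conclusion it selects with the majority-class conclusion. It assumes the usual form of Gras' index, namely the observed number of counter-examples minus the number expected under independence, divided by the square root of that expected number; the form actually used in the paper may differ in detail. The function names, leaf labels and counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import sqrt


def implication_index(counter_examples, n_premise, n_not_conclusion, n_total):
    """Assumed form of Gras' index: a standardized residual on counter-examples.

    Strongly negative values mean far fewer counter-examples than expected
    under independence, i.e. a strong implication.
    """
    expected = n_premise * n_not_conclusion / n_total
    return (counter_examples - expected) / sqrt(expected)


def best_conclusion(leaf_counts, class_totals):
    """Return the class with the lowest index in a leaf, plus all indexes."""
    n_total = sum(class_totals.values())
    n_leaf = sum(leaf_counts.values())
    scores = {}
    for label, count in leaf_counts.items():
        # Counter-examples of "leaf => label" are the leaf cases not in label.
        scores[label] = implication_index(
            n_leaf - count, n_leaf, n_total - class_totals[label], n_total
        )
    return min(scores, key=scores.get), scores


# Hypothetical class counts in the two leaves of a small tree.
leaves = {"A": {"yes": 9, "no": 11}, "B": {"yes": 11, "no": 69}}
class_totals = {"yes": 20, "no": 80}

for name, counts in leaves.items():
    majority = max(counts, key=counts.get)
    implied, scores = best_conclusion(counts, class_totals)
    print(name, "majority:", majority, "implication:", implied, scores)
```

In this toy table, leaf A holds 9 "yes" cases out of 20 although "yes" covers only 20% of the whole sample. The majority rule labels the leaf "no", whereas the index for "yes" is -1.25 against 2.5 for "no", so the implication criterion picks "yes" as the conclusion most typical of that leaf.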