Combining Explicitness and Classifying Performance via MIDOVA Lossless Representation for Qualitative Datasets
Abstract
MIDOVA lists the relevant combinations of K boolean variables, yielding an expansion of the original set of variables that is well suited to a number of data mining tasks. MIDOVA takes into account the presence as well as the absence of items. Level-k itemsets are built from level-(k-1) ones using the concept of residue, which captures the potential of an itemset to create higher-order non-trivial associations. We assess the value of this representation by applying it to three well-known classification tasks: the results show that our objective of extracting the relevant interactions hidden in the data, and only those, has been met.
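To make the notion of itemsets over both present and absent items concrete, the following minimal sketch enumerates level-k combinations of boolean variables with every presence/absence pattern and counts how many rows match each one. It is an illustration only, not the MIDOVA algorithm: the residue criterion is not defined in this abstract and is therefore not implemented, and the names `expand_with_absence` and `level_k_supports`, as well as the toy dataset, are hypothetical.

```python
from itertools import combinations


def expand_with_absence(rows, items):
    """Encode each row as a set of literals (item, present?).

    Hypothetical helper: every item yields exactly one literal per row,
    (item, True) if the item is present and (item, False) if it is absent.
    """
    return [{(item, item in row) for item in items} for row in rows]


def level_k_supports(literal_rows, k):
    """Count the rows matching each combination of k literals.

    This is plain exhaustive counting. MIDOVA would keep only the
    relevant combinations, using the residue, which is not modelled here.
    """
    counts = {}
    for row in literal_rows:
        # Sorting gives a canonical literal order, so keys are consistent.
        for combo in combinations(sorted(row), k):
            counts[combo] = counts.get(combo, 0) + 1
    return counts


# Toy qualitative dataset (assumed for illustration): each row lists the items present.
items = ["a", "b", "c"]
rows = [{"a", "b"}, {"a"}, {"b", "c"}, {"a", "b", "c"}]

literal_rows = expand_with_absence(rows, items)
pair_supports = level_k_supports(literal_rows, 2)

for combo, support in sorted(pair_supports.items()):
    print(combo, support)
```

Patterns that never occur in the data, such as "a absent and b absent" in this toy dataset, simply do not appear in the result. Exhaustive counting of this kind grows quickly with k, which is why a selection criterion for the combinations worth extending is needed.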