Tree-based ranking methods

Stéphan Clémençon; Nicolas Vayatis

Preprint, Working Paper. Year: 2008


Stéphan Clémençon

Role: Author
PersonId: 174491
IdHAL: stephan-clemencon
ORCID: 0000-0002-5879-9500
IdRef: 08905203X

Laboratoire Traitement et Communication de l'Information

Nicolas Vayatis

Role: Author
PersonId: 848026

Centre de Mathématiques et de Leurs Applications

Abstract

The paper investigates how recursive partitioning methods can be adapted to the bipartite ranking problem. In ranking, the pursued goal is global: based on past data, define an order on the whole input space X, so that positive instances take up the top ranks with maximum probability. The most natural way to order all instances consists of projecting the input data onto the real line through a real-valued scoring function s and using the natural order on R. The accuracy of the ordering induced by a candidate s is classically measured in terms of the ROC curve or the AUC. Here we discuss the design of tree-structured scoring functions obtained by recursively maximizing the AUC criterion. The connection with recursive piecewise linear approximation of the optimal ROC curve, both in the $L_1$-sense and in the $L_\infty$-sense, is highlighted. A novel tree-based algorithm for ranking, called TreeRank, is proposed. Consistency results and generalization bounds of functional nature are established for this ranking method, when considering either the $L_1$ or the $L_\infty$ distance. We also describe committee-based learning procedures using TreeRank as a "base ranker", in order to overcome obvious drawbacks of such a top-down partitioning technique. Simulation results on artificial data are also displayed.
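To make the AUC criterion mentioned in the abstract concrete, the following is a minimal sketch, not the TreeRank algorithm itself. It computes the empirical AUC of a scoring function as the fraction of correctly ordered (positive, negative) pairs, and picks a single threshold split on one-dimensional inputs that maximizes it. The function names (`empirical_auc`, `stump_score`, `best_split`) and the toy data are hypothetical and chosen only for illustration.

```python
from itertools import product


def empirical_auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    total = 0.0
    for sp, sn in product(scores_pos, scores_neg):
        if sp > sn:
            total += 1.0
        elif sp == sn:
            total += 0.5
    return total / (len(scores_pos) * len(scores_neg))


def stump_score(threshold):
    """Piecewise-constant scoring function induced by one split of the real line."""
    return lambda x: 1.0 if x > threshold else 0.0


def best_split(xs_pos, xs_neg):
    """Threshold among the observed values whose stump has the highest empirical AUC."""
    candidates = sorted(set(xs_pos) | set(xs_neg))

    def auc_for(t):
        s = stump_score(t)
        return empirical_auc([s(x) for x in xs_pos], [s(x) for x in xs_neg])

    return max(candidates, key=auc_for)


# Toy one-dimensional sample (assumed data, not taken from the paper).
xs_pos = [2.0, 3.5, 4.0, 5.0]
xs_neg = [0.5, 1.0, 2.5, 3.0]

t = best_split(xs_pos, xs_neg)
s = stump_score(t)
auc_stump = empirical_auc([s(x) for x in xs_pos], [s(x) for x in xs_neg])
print(t, auc_stump)
```

A tree-structured scoring function would repeat such a split inside each resulting cell and assign ordered scores to the leaves; this sketch stops after one split.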

Keywords

bipartite ranking, decision trees, pruning, ROC curve, AUC

Subject areas

Statistics [math.ST]; Theory [stat.TH]

Main file

TreeRank_long_sub.pdf (334.58 KB)

Origin: Files produced by the author(s)

Contributor: Nicolas Vayatis

https://hal.science/hal-00268068

Submitted on: Monday, 8 September 2008, 18:06:26

Last modified on: Tuesday, 9 April 2024, 03:10:39

Long-term archiving on: Saturday, 26 November 2016, 00:47:27

Dates and versions

hal-00268068, version 1 (31-03-2008)

hal-00268068, version 2 (04-04-2008)

hal-00268068, version 3 (05-04-2008)

hal-00268068, version 4 (19-05-2008)

hal-00268068, version 5 (08-09-2008)

Identifiers

HAL Id: hal-00268068, version 5

Cite

Stéphan Clémençon, Nicolas Vayatis. Tree-based ranking methods. 2008. ⟨hal-00268068v5⟩


Collections

INSTITUT-TELECOM CNRS ENS-CACHAN PARISTECH LTCI IDS S2A ENS-PARIS-SACLAY
