Conference paper, Year: 2018

CRAFTML, an Efficient Clustering-based Random Forest for Extreme Multi-label Learning

Wissam Siblini
  • Role: Author

Pascale Kuntz
  • Role: Author
  • PersonId: 853783
  • IdRef: 103904638
Frank Meyer
  • Role: Author
  • PersonId: 953045

Abstract

Extreme Multi-label Learning (XML) considers large sets of items described by a number of labels that can exceed one million. Tree-based methods, which hierarchically partition the problem into small-scale sub-problems, are particularly promising in this context because they reduce learning and prediction complexity and open the way to parallelization. However, the current best approaches do not exploit tree randomization, which has shown its efficiency in random forests, and they resort to complex partitioning strategies. To overcome these limits, we introduce CRAFTML, a new random-forest-based algorithm with a very fast partitioning approach. Experimental comparisons on nine datasets from the XML literature show that it outperforms the other tree-based approaches. Moreover, with a parallelized implementation reduced to five cores, it is competitive with the best state-of-the-art methods, which run on one hundred-core machines.
Main file
paper.pdf (708.06 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01982945, version 1 (16-01-2019)

Identifiers

  • HAL Id: hal-01982945, version 1

Cite

Wissam Siblini, Pascale Kuntz, Frank Meyer. CRAFTML, an Efficient Clustering-based Random Forest for Extreme Multi-label Learning. The 35th International Conference on Machine Learning (ICML 2018), Jul 2018, Stockholm, Sweden. ⟨hal-01982945⟩
169 views
84 downloads
