Scalable Learnability Measure for Hierarchical Learning in Large Scale Multi-Class Classification

Raphael Puget (1), Nicolas Baskiotis (1), Patrick Gallinari (1)
(1) MLIA - Machine Learning and Information Access, LIP6 - Laboratoire d'Informatique de Paris 6
Abstract: Growth in computational and storage capacity has led to increasingly complex data: data can be represented in much more detail (many features) and in very large amounts. In text categorization or image classification, the number of labels can range from $10^2$ to $10^5$, and the number of features from $10^4$ to $10^6$. The main trade-off is generally between prediction accuracy and inference time. A common methodology organizes multiple classifiers in a hierarchical structure in order to reduce the computational cost of inference. A popular category of algorithms builds this structure iteratively. Inspired by clustering, each iteration is either a splitting step (top-down algorithms) or an aggregating step (bottom-up algorithms). This step relies on a measure to determine the split or aggregation rule (for example entropy, similarity between classes, or separability). Such measures are often computationally heavy and cannot be used in a large-scale context. In this paper, we propose to build the measures of interest in a reduced projection of the input space. Preliminary experiments on real data show the interest of such methods. We also present preliminary experiments that integrate a "learnability" measure into hierarchical approaches.
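To make the inference-time argument concrete, here is a minimal sketch that counts classifier evaluations per prediction for a flat one-vs-rest scheme and for a balanced label tree. It is not taken from the paper: the function names, the balanced-tree assumption, and the cost model (one evaluation per child at each visited node) are illustrative assumptions.

```python
def flat_cost(num_labels: int) -> int:
    """Evaluations per prediction for flat one-vs-rest: one scorer per label."""
    return num_labels


def tree_depth(num_labels: int, branching: int) -> int:
    """Smallest depth d such that branching**d >= num_labels (balanced tree)."""
    depth, capacity = 0, 1
    while capacity < num_labels:
        capacity *= branching
        depth += 1
    return depth


def tree_cost(num_labels: int, branching: int) -> int:
    """Evaluations per prediction in a balanced tree.

    Assumed cost model: at each level on the root-to-leaf path,
    every child of the current node is scored once.
    """
    return branching * tree_depth(num_labels, branching)


if __name__ == "__main__":
    # Label counts at the two ends of the range quoted in the abstract.
    for k in (10**2, 10**5):
        print(k, flat_cost(k), tree_cost(k, branching=10))
```

Under these assumptions, the per-prediction cost grows logarithmically with the number of labels instead of linearly, which is the motivation for hierarchical structures. The sketch says nothing about accuracy, which depends on how the tree is built.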
Document type:
Conference paper
WSDM Workshop Web-Scale Classification: Classifying Big Data from the Web, 2014, New York, United States. 2014
https://hal.archives-ouvertes.fr/hal-01068413
Contributor: Raphael Puget
Submitted on: Friday, September 26, 2014 - 14:25:40
Last modified on: Thursday, November 22, 2018 - 14:27:15
Document(s) archived on: Saturday, December 27, 2014 - 10:15:17

File

wsdm_ws.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01068413, version 1

Citation

Raphael Puget, Nicolas Baskiotis, Patrick Gallinari. Scalable Learnability Measure for Hierarchical Learning in Large Scale Multi-Class Classification. WSDM Workshop Web-Scale Classification: Classifying Big Data from the Web, 2014, New York, United States. 2014. 〈hal-01068413〉
