Strategies to select examples for Active Learning with Conditional Random Fields

Vincent Claveau 1, Ewa Kijak 1
1 LinkMedia - Creating and exploiting explicit links between multimedia fragments
Inria Rennes – Bretagne Atlantique, IRISA_D6 - MEDIA ET INTERACTIONS
Abstract: Nowadays, many NLP problems are tackled as supervised machine learning tasks. Consequently, the cost of the expertise needed to annotate the examples is a widespread issue. Active learning offers a framework to address that issue, making it possible to control the annotation cost while maximizing classifier performance, but it relies on the key step of choosing which example will be proposed to the expert. In this paper, we examine and propose such selection strategies in the specific case of Conditional Random Fields (CRF), which are widely used in NLP. On the one hand, we propose a simple method to correct a bias of some state-of-the-art selection techniques. On the other hand, we detail an original approach to selecting the examples, based on respecting proportions in the datasets. These contributions are validated through a large range of experiments involving several datasets and tasks, including named entity recognition, chunking, phonetization, and word sense disambiguation.
Document type:
Conference paper
CICLing 2017 - 18th International Conference on Computational Linguistics and Intelligent Text Processing, Apr 2017, Budapest, Hungary. pp.1-14

Cited literature: 29 references

https://hal.archives-ouvertes.fr/hal-01621338
Contributor: Vincent Claveau
Submitted on: Monday, October 23, 2017 - 11:44:07
Last modified on: Wednesday, May 16, 2018 - 11:24:14
Document(s) archived on: Wednesday, January 24, 2018 - 13:40:20

File

Claveau_CICling2017.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01621338, version 1

Citation

Vincent Claveau, Ewa Kijak. Strategies to select examples for Active Learning with Conditional Random Fields. CICLing 2017 - 18th International Conference on Computational Linguistics and Intelligent Text Processing, Apr 2017, Budapest, Hungary. pp.1-14. 〈hal-01621338〉

Metrics

Record views: 276

File downloads: 126