Legal NERC with ontologies, Wikipedia and curriculum learning

Cristian Cardellino 1, Milagro Teruel 1, Laura Alemany 1, Serena Villata 2
2 WIMMICS - Web-Instrumented Man-Machine Interactions, Communities and Semantics
CRISAM - Inria Sophia Antipolis - Méditerranée, SPARKS - Scalable and Pervasive softwARe and Knowledge Systems
Abstract: In this paper, we present a Wikipedia-based approach to developing resources for the legal domain. We establish a mapping between a legal domain ontology, LKIF (Hoekstra et al., 2007), and a Wikipedia-based ontology, YAGO (Suchanek et al., 2007), and use that mapping to populate LKIF. Moreover, we use the mentions of those entities in Wikipedia text to train a specific Named Entity Recognizer and Classifier (NERC). We find that this classifier works well on Wikipedia text but, as could be expected, its performance decreases on a corpus of judgments of the European Court of Human Rights. Nevertheless, the tool will be used as a preprocessing step for human annotation. We resort to a technique called curriculum learning, which aims to overcome overfitting by learning increasingly complex concepts. However, we find that in this particular setting the method works best when learning from the most specific to the most general concepts, not the other way round.
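The ordering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The level names, the toy mentions and their labels, and the `curriculum` and `train` helpers are all assumptions made for illustration. The sketch only shows how the same mentions would be presented to a learner one granularity level at a time, either from most specific to most general or the reverse.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical label levels, listed from most general to most specific.
LEVELS: List[str] = ["entity", "lkif_class", "yago_class"]

# Toy mentions labeled at every level (invented for illustration only).
MENTIONS: List[Dict[str, str]] = [
    {"text": "European Court of Human Rights",
     "entity": "Entity", "lkif_class": "Organization", "yago_class": "Court"},
    {"text": "Convention",
     "entity": "Entity", "lkif_class": "Document", "yago_class": "Treaty"},
]


def curriculum(
    mentions: List[Dict[str, str]], specific_first: bool = True
) -> Iterator[Tuple[str, List[Tuple[str, str]]]]:
    """Yield (level, labeled examples) pairs in curriculum order."""
    order = list(reversed(LEVELS)) if specific_first else list(LEVELS)
    for level in order:
        yield level, [(m["text"], m[level]) for m in mentions]


def train(mentions: List[Dict[str, str]], specific_first: bool = True) -> List[str]:
    """Walk the curriculum and return the order in which levels were visited."""
    visited: List[str] = []
    for level, batch in curriculum(mentions, specific_first):
        # Placeholder: a real learner would be updated on `batch` here,
        # carrying over what it learned in the previous stage.
        visited.append(level)
    return visited


if __name__ == "__main__":
    print(train(MENTIONS, specific_first=True))
    print(train(MENTIONS, specific_first=False))
```

Calling `train(MENTIONS, specific_first=True)` visits the levels from the most specific labels to the most general ones, the direction the abstract reports as working best in this setting. Passing `specific_first=False` gives the general-to-specific order.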
Document type:
Conference paper
15th European Chapter of the Association for Computational Linguistics (EACL 2017), 2017, Valencia, Spain. pp. 254-259, 〈10.18653/v1/E17-2041〉

https://hal.archives-ouvertes.fr/hal-01572444
Contributor: Serena Villata
Submitted on: Monday, 7 August 2017 - 13:53:46
Last modified on: Wednesday, 13 December 2017 - 10:16:05

File

EACL2017.pdf
Files produced by the author(s)

Citation

Cristian Cardellino, Milagro Teruel, Laura Alemany, Serena Villata. Legal NERC with ontologies, Wikipedia and curriculum learning. 15th European Chapter of the Association for Computational Linguistics (EACL 2017), 2017, Valencia, Spain. pp. 254-259, 〈10.18653/v1/E17-2041〉. 〈hal-01572444〉
