Semantic Frame Parsing for Information Extraction : the CALOR corpus

Abstract: This paper presents a publicly available corpus of French encyclopedic history texts annotated according to the Berkeley FrameNet formalism. The main difference between our approach and previous work on semantic parsing with FrameNet is that we are interested not in full-text parsing but in partial parsing. The goal is to select from the FrameNet resources the minimal set of frames that will be useful for the targeted application, in our case Information Extraction from encyclopedic documents. Such an approach allows the manual annotation of larger corpora than those obtained through full-text parsing, and therefore opens the door to methods for frame parsing other than those used so far on the FrameNet 1.5 benchmark corpus. The approaches compared in this study rely on an integrated sequence labeling model which jointly optimizes frame identification and semantic role segmentation and identification. The models compared are CRFs and multitask bi-LSTMs.
Document type:
Conference paper
LREC2018, May 2018, Miyazaki, Japan

https://hal.archives-ouvertes.fr/hal-01959187
Contributor: Gabriel Marzinotto
Submitted on: Tuesday, 18 December 2018 - 15:14:37
Last modified on: Thursday, 20 December 2018 - 16:10:03
Document(s) archived on: Wednesday, 20 March 2019 - 09:47:39

Files

Semantic Frame Parsing for Inf...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01959187, version 1
  • ARXIV : 1812.08039

Citation

Gabriel Marzinotto, Jeremy Auguste, Frederic Bechet, Géraldine Damnati, Alexis Nasr. Semantic Frame Parsing for Information Extraction : the CALOR corpus. LREC2018, May 2018, Miyazaki, Japan. 〈hal-01959187〉