Preprint, Working Paper. Year: 2006

Deliverable D5.1: Report on method and language for the production of the augmented document representations

Abstract

This deliverable describes the specifications of the Natural Language Processing (NLP) line to be developed within Work Package 5 (WP5) of the Alvis project. WP5 is in charge of the NLP aspects of the information retrieval process. Its main objective is to enrich and normalise the crawled documents provided by WP7 prior to their indexing. WP5 therefore outputs a set of annotated documents, which are provided either to WP2 as input to its probabilistic model, which in turn delivers documents to WP3 for indexing, or to WP6 as acquisition corpora. This deliverable presents the objectives of the WP5 document annotation process and introduces the basic linguistic notions used in the rest of the report. The core section presents the various types of annotations that WP5 is expected to produce and describes the format in which these annotations are encoded. The deliverable also gives an overview of the WP5 processing line, its architecture for the various languages studied in Alvis, and the NLP components that produce the annotated documents.
Main file
ALVIS_D5_1_20043112_P13_AN.pdf (782.46 KB)

Dates and versions

hal-00101549, version 1 (27-09-2006)

Identifiers

  • HAL Id: hal-00101549, version 1

Cite

Erick Alphonse, Sophie Aubin, Julien Derivière, Thierry Hamon, Dunja Mladenic, et al. Deliverable D5.1: Report on method and language for the production of the augmented document representations. 2006. ⟨hal-00101549⟩