Journal articles

Hétérogénéité et extraction d'information factuelle dans un corpus de récits de voyage [Heterogeneity and factual information extraction in a corpus of travel narratives]

Anaïs Lefeuvre 1 Natalia Vinogradova 2
2 SIGNES - Linguistic signs, grammar and meaning: computational logic for natural language
CNRS - Centre National de la Recherche Scientifique : UMR5800, École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB), Inria Bordeaux - Sud-Ouest, Université Sciences et Technologies - Bordeaux 1
Abstract : Information extraction requires a good knowledge of the object to be extracted. In this work we explore how the textual sequences that describe the itinerary behave within travel writing. The travel narrative is a genre recognized as heterogeneous, so we analyze its heterogeneity in order to isolate homogeneous sequences, one of which is the itinerary description. Our analysis covers several discourse levels, which gives us an overview of how the itinerary behaves throughout the narration. To automate the extraction of itineraries, we use different tools, each suited to the discourse level in question. Our theoretical framework at the level of semantic representation, SDRT (Segmented Discourse Representation Theory), lends itself to this kind of analysis, as we show in the course of this work. The study clarifies the behavior of itinerary sequences and leads us to enrich our extraction method so that it copes with the heterogeneity of the discourse units dedicated to the itinerary.
Document type :
Journal articles

https://hal.archives-ouvertes.fr/hal-00751871
Contributor : Anaïs Lefeuvre
Submitted on : Wednesday, November 14, 2012 - 1:53:10 PM
Last modification on : Thursday, February 11, 2021 - 2:52:01 PM
Long-term archiving on : Friday, February 15, 2013 - 3:42:17 AM

File

HetLVHal.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00751871, version 1

Citation

Anaïs Lefeuvre, Natalia Vinogradova. Hétérogénéité et extraction d'information factuelle dans un corpus de récits de voyage. Langages, Armand Colin (Larousse jusqu'en 2003), 2012, 3 (187), pp. 127-144. ⟨hal-00751871⟩
