Disambiguation of occurrences of reformulation markers c'est-à-dire, disons, ça veut dire

Abstract : Reformulation consists of restating an utterance that has already been produced, with formal and/or semantic modifications. Reformulations are sometimes signaled by specific markers, such as c'est-à-dire, disons, ça veut dire. We study the reformulation phenomenon and concentrate on the syntagmatic structure S1 marker S2, built around the reformulation markers, in which the first segment S1 is reformulated by the second segment S2. The purpose of our study is to automatically distinguish reformulation from non-reformulation occurrences of the markers studied. We design a rule-based system to make this decision. Two kinds of French corpora are processed: the ESLO spoken corpora and a forum discussion corpus. The system is evaluated against manually annotated, consensual reference data. It was developed on a subset of the spoken corpus and then applied to the rest of the data. The results reach up to 0.75 precision and are comparable across the corpora analyzed, although spoken corpora remain more difficult to process.
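
To make the S1 marker S2 structure and the precision measure concrete, here is a minimal sketch in Python. It is not the authors' system: the abstract does not list its rules, so the sketch only splits an utterance around the first marker found and computes precision against reference labels. The function names, the regular expression, and the example sentence are all assumptions made for illustration.

```python
import re

# The three markers named in the abstract.
MARKERS = ["c'est-à-dire", "ça veut dire", "disons"]

# Hypothetical matching strategy: case-insensitive, whole-word match.
MARKER_RE = re.compile(
    r"\b(" + "|".join(re.escape(m) for m in MARKERS) + r")\b",
    re.IGNORECASE,
)


def split_around_marker(utterance):
    """Return (S1, marker, S2) for the first marker found, or None."""
    match = MARKER_RE.search(utterance)
    if match is None:
        return None
    # Trim spaces and commas left over at the segment boundaries.
    s1 = utterance[:match.start()].strip(" ,")
    s2 = utterance[match.end():].strip(" ,")
    return s1, match.group(1).lower(), s2


def precision(predicted, reference):
    """Fraction of occurrences predicted as reformulations that the
    reference data also labels as reformulations."""
    true_pos = sum(1 for p, r in zip(predicted, reference) if p and r)
    pred_pos = sum(1 for p in predicted if p)
    return true_pos / pred_pos if pred_pos else 0.0
```

A disambiguation rule would then take the (S1, marker, S2) triple and decide whether S2 actually reformulates S1; which cues such rules rely on is described in the paper itself, not in this abstract.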

https://hal.archives-ouvertes.fr/hal-01426808
Contributor : Natalia Grabar
Submitted on : Wednesday, January 4, 2017 - 10:26:52 PM
Last modification on : Tuesday, July 3, 2018 - 11:47:20 AM
Document(s) archived on : Wednesday, April 5, 2017 - 3:31:22 PM

File

grabar-JADT2016.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01426808, version 1

Citation

Natalia Grabar, Iris Eshkol-Taravella. Disambiguation of occurrences of reformulation markers c'est-à-dire, disons, ça veut dire. JADT 2016, Jun 2016, Nice, France. 〈hal-01426808〉
