Textometric Exploitation of Coreference-annotated Corpora with TXM: Methodological Choices and First Outcomes

Abstract : In this article we present a set of measures, some of which lend themselves to specific visualisations, intended to enrich the exploration and exploitation of annotated data, in particular coreference chains. We first present a specific use of the well-known concordancer, adapted here to display the elements of a coreference chain. We then present a histogram generator that can, for example, display the distribution of the various coreference chains of a text for a given value of one of the annotated properties. Finally, we present what we call progress diagrams, which display the progress of each chain throughout the text. We conclude on the value of these (interactive) modes of visualisation for making the annotation phase more controlled and more effective.

Cited literature : 9 references

https://hal.archives-ouvertes.fr/hal-01814858
Contributor : Frédéric Landragin
Submitted on : Wednesday, June 13, 2018 - 3:55:35 PM
Last modification on : Thursday, February 7, 2019 - 4:53:17 PM
Document(s) archived on : Monday, September 17, 2018 - 11:36:04 AM

Identifiers

  • HAL Id : hal-01814858, version 1

Citation

Matthieu Quignard, Serge Heiden, Frédéric Landragin, Matthieu Decorde. Textometric Exploitation of Coreference-annotated Corpora with TXM: Methodological Choices and First Outcomes. Fourteenth International Conference on the Statistical Analysis of Textual Data, Jun 2018, Rome, Italy. pp.610-615. ⟨hal-01814858⟩
