Concept Discovery and Automatic Semantic Annotation for Language Understanding in an Information-Query Dialogue System Using Latent Dirichlet Allocation and Segmental Methods

Abstract: Efficient statistical approaches have recently been proposed for natural language understanding in dialogue systems. However, these approaches are trained on data semantically annotated at the segmental level, which increases the cost of producing these resources. This kind of semantic annotation requires both determining the concepts in a sentence and linking them to their corresponding word segments. In this paper, we propose a two-step automatic method for semantic annotation. The first step applies latent Dirichlet allocation to discover concepts in a dialogue corpus. This knowledge is then used as a bootstrap to automatically infer a segmentation of a word sequence into concepts, using either integer linear optimisation or stochastic word alignment models (IBM models). The relation between automatically derived and manually defined task-dependent concepts is evaluated on a spoken dialogue task with a reference annotation.
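
As a rough illustration of the second step described in the abstract (turning per-word concept knowledge into a segmentation of a word sequence), the sketch below labels each word with a concept and merges adjacent words that share a label. It is not the chapter's method: a simple Viterbi-style dynamic program with a fixed switch penalty stands in for the integer linear optimisation and IBM alignment models, and the `TOY_MODEL` probabilities, the concept names, the `segment` function, and its `switch_penalty` and `floor` parameters are all invented for the example. In the approach the abstract describes, the word-given-concept knowledge would come from latent Dirichlet allocation run on a dialogue corpus.

```python
import math

# Hypothetical P(word | concept) table, standing in for what a topic model
# might provide. The values are made up for illustration only.
TOY_MODEL = {
    "hotel": {"a": 0.3, "hotel": 0.6, "room": 0.1},
    "location": {"in": 0.3, "paris": 0.6, "near": 0.1},
}


def segment(words, word_given_concept, concepts, switch_penalty=1.0, floor=1e-6):
    """Split `words` into contiguous segments, each labelled with one concept.

    Maximises the sum of log P(word | concept) over the sequence, minus
    `switch_penalty` each time the concept changes between adjacent words.
    Unseen (word, concept) pairs get the small probability `floor`.
    """
    if not words:
        return []

    def score(word, concept):
        return math.log(word_given_concept.get(concept, {}).get(word, floor))

    # best[c] = (score, labels) of the best labelling so far that ends in c
    best = {c: (score(words[0], c), [c]) for c in concepts}
    for word in words[1:]:
        new_best = {}
        for c in concepts:
            # Choose the best previous concept, paying the penalty on a change.
            prev_score, prev_labels = max(
                (
                    (s - (switch_penalty if p != c else 0.0), labels)
                    for p, (s, labels) in best.items()
                ),
                key=lambda item: item[0],
            )
            new_best[c] = (prev_score + score(word, c), prev_labels + [c])
        best = new_best

    _, labels = max(best.values(), key=lambda item: item[0])

    # Merge runs of identical labels into (concept, word segment) pairs.
    segments = []
    for word, label in zip(words, labels):
        if segments and segments[-1][0] == label:
            segments[-1][1].append(word)
        else:
            segments.append((label, [word]))
    return [(label, " ".join(ws)) for label, ws in segments]


example = segment("a hotel in paris".split(), TOY_MODEL, ["hotel", "location"])
print(example)  # [('hotel', 'a hotel'), ('location', 'in paris')]
```

The switch penalty plays the role of a preference for few, long segments. An integer linear formulation such as the one the abstract mentions could express the same preference through constraints rather than a fixed cost.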

https://hal.archives-ouvertes.fr/hal-01314550
Contributor: Bibliothèque Universitaire Déposants Hal-Avignon
Submitted on: Wednesday, February 27, 2019 - 10:27:38 AM
Last modification on: Wednesday, May 15, 2019 - 10:12:03 AM
Long-term archiving on: Tuesday, May 28, 2019 - 1:09:18 PM

File

chapter-Knowledge-Discovery-Kn...
Files produced by the author(s)

Citation

Nathalie Camelin, Boris Detienne, Stéphane Huet, Dominique Quadri, Fabrice Lefèvre. Concept Discovery and Automatic Semantic Annotation for Language Understanding in an Information-Query Dialogue System Using Latent Dirichlet Allocation and Segmental Methods. Fred A., Dietz J.L.G., Liu K., Filipe J. Knowledge Discovery, Knowledge Engineering and Knowledge Management, 348, Springer, pp.45-59, 2013, Communications in Computer and Information Science, 978-3-642-37185-1. ⟨10.1007/978-3-642-37186-8_3⟩. ⟨hal-01314550⟩
