Syntactic parsing of chat language in contact center conversation corpus

Abstract : Chat language is often referred to as computer-mediated communication (CMC). Most previous studies of chat language have been dedicated to collecting "chat room" data, as it is the kind of data most accessible on the Web. This kind of data falls under the informal register, whereas in this paper we are interested in understanding the mechanisms of a more formal kind of CMC: dialog chat in contact centers. The paper focuses on the particularities of this type of dialog and on the type of language used by customers and agents, as a step toward understanding this new kind of CMC data. The challenges in processing chat data come from the fact that natural language processing tools such as syntactic parsers and part-of-speech taggers are typically trained under mismatched conditions. We describe in this study the impact of such a mismatch on a syntactic parsing task.
Contributor : Alexis Nasr
Submitted on : Wednesday, February 15, 2017 - 5:25:22 PM
Last modification on : Monday, March 4, 2019 - 2:04:14 PM
Long-term archiving on : Tuesday, May 16, 2017 - 12:11:42 PM


Publisher files allowed on an open archive






Alexis Nasr, Geraldine Damnati, Aleksandra Guerraz, Frederic Bechet. Syntactic parsing of chat language in contact center conversation corpus. Annual SIGdial Meeting on Discourse and Dialogue, Sep 2016, Los Angeles, United States. pp.175 - 184, ⟨10.18653/v1/W16-3621⟩. ⟨hal-01454768⟩


