Conference paper, Year: 2016

Sentence boundary detection for transcribed Tunisian Arabic

Abstract

This paper studies the problem of detecting sentence boundaries in transcribed spoken Tunisian Arabic. We compare and contrast three different methods for detecting sentence boundaries in transcribed speech. The first method uses a set of handmade contextual patterns to identify sentence boundaries. The second method classifies the words of a transcription into four classes according to their position in a sentence. Both methods rely only on lexical information and some prosodic information, such as silent and filled pauses. Finally, we develop two techniques for combining the results of the two proposed methods. We show that a sentence boundary detection system can improve the accuracy of a POS tagger developed for tagging transcribed Tunisian Arabic.
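To make the classification-based idea concrete, here is a minimal sketch of how per-word position labels could be decoded into sentences. The abstract does not name the four classes, so the label set used here ("B" for the first word of a sentence, "I" for a word inside it, "E" for the last word, "S" for a single-word sentence) is an assumption, and `split_sentences` is a hypothetical helper, not the authors' implementation.

```python
from typing import List, Sequence

# Assumed label set (not taken from the paper):
# "B" = first word of a sentence, "I" = inside a sentence,
# "E" = last word of a sentence, "S" = single-word sentence.


def split_sentences(words: Sequence[str], labels: Sequence[str]) -> List[List[str]]:
    """Group words into sentences using one position label per word."""
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")

    sentences: List[List[str]] = []
    current: List[str] = []
    for word, label in zip(words, labels):
        # A sentence-initial label closes any sentence left open
        # by a missing end label.
        if label in ("B", "S") and current:
            sentences.append(current)
            current = []
        current.append(word)
        # A sentence-final label closes the current sentence.
        if label in ("E", "S"):
            sentences.append(current)
            current = []
    # Keep trailing words that never received an end label.
    if current:
        sentences.append(current)
    return sentences


if __name__ == "__main__":
    # Placeholder tokens stand in for transcribed words.
    tokens = ["w1", "w2", "w3", "w4", "w5", "w6"]
    tags = ["B", "I", "E", "S", "B", "E"]
    print(split_sentences(tokens, tags))
```

With the placeholder input above, the words are grouped as `[w1, w2, w3]`, `[w4]`, and `[w5, w6]`. The decoder tolerates inconsistent label sequences (for example a "B" that follows an unfinished sentence) by closing the open sentence, which is one possible design choice among several.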
No file deposited

Dates and versions

hal-01462133, version 1 (08-02-2017)

Identifiers

  • HAL Id: hal-01462133, version 1

Cite

Inès Zribi, Inès Kammoun, Mariem Ellouze, Lamia Hadrich Belguith, Philippe Blache. Sentence boundary detection for transcribed Tunisian Arabic. Konvens-2016, RUHR-UNIVERSITAT BOCHUM, 2016, Bochum, Germany. ⟨hal-01462133⟩