Segmentation tool for hadith corpus to generate TEI encoding

Abstract : A segmentation tool for a hadith corpus is a necessary step in preparing hadith texts for TEI encoding. In this context, we aim to develop a tool that segments hadith text from the Sahih al-Bukhari corpus. To achieve this objective, we start by identifying the different hadith structures. We then develop an automatic processing tool for hadith segmentation. This tool will be integrated into a prototype that supports the TEI encoding process. The experimentation and evaluation of this tool are based on the Sahih al-Bukhari corpus. The results obtained are encouraging, despite some flaws related to exceptional cases of hadith structure.
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01794105
Contributor : Hajer Maraoui
Submitted on : Thursday, May 17, 2018 - 11:41:38 AM
Last modification on : Friday, March 22, 2019 - 2:22:12 PM
Document(s) archived on : Wednesday, September 26, 2018 - 1:28:09 AM

File

Article AISI 2018.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01794105, version 1

Citation

Hajer Maraoui, Kais Haddar, Laurent Romary. Segmentation tool for hadith corpus to generate TEI encoding. 4th International Conference on Advanced Intelligent Systems and Informatics (AISI’18), Sep 2018, Cairo, Egypt. ⟨hal-01794105⟩
