
Segmentation tool for hadith corpus to generate TEI encoding

Abstract : A segmentation tool for a hadith corpus is a necessary step in preparing the TEI encoding of hadith. In this context, we aim to develop a tool that segments hadith text from the Sahih al-Bukhari corpus. To achieve this objective, we start by identifying the different hadith structures. Then, we develop an automatic processing tool for hadith segmentation. This tool will be integrated into a prototype that supports the TEI encoding process. The experimentation and evaluation of this tool are based on the Sahih al-Bukhari corpus. The results obtained are encouraging, despite some flaws related to exceptional cases of hadith structure.
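
To illustrate the general idea of segmenting a hadith and wrapping the parts in TEI-style markup, here is a minimal sketch. It is not the tool described in the paper. It assumes the common division of a hadith into a chain of narrators (isnad) followed by the reported text (matn), and it assumes a single boundary phrase ("said:") separates the two. The function names, the boundary marker, the placeholder sample text, and the choice of `div` and `seg` elements with `type` attributes are all hypothetical choices made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical boundary marker: the phrase assumed to close the narrator chain.
BOUNDARY = "said:"


def segment_hadith(text, boundary=BOUNDARY):
    """Split text into (isnad, matn) at the first occurrence of the boundary."""
    head, sep, tail = text.partition(boundary)
    if not sep:
        # No marker found: an exceptional structure, kept whole as matn.
        return "", text.strip()
    return (head + sep).strip(), tail.strip()


def to_tei(isnad, matn, n):
    """Wrap the segments in a TEI-style <div>; element choice is an assumption."""
    div = ET.Element("div", {"type": "hadith", "n": str(n)})
    if isnad:
        ET.SubElement(div, "seg", {"type": "isnad"}).text = isnad
    ET.SubElement(div, "seg", {"type": "matn"}).text = matn
    return ET.tostring(div, encoding="unicode")


# Placeholder text, not a quotation from the corpus.
sample = "Narrator A told us, from Narrator B, that he said: placeholder statement."
isnad, matn = segment_hadith(sample)
print(to_tei(isnad, matn, 1))
```

A text with no boundary marker falls through to the matn-only branch, which is one simple way to surface the exceptional structures mentioned in the abstract for manual review.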
Document type :
Conference papers

Cited literature [11 references]
Contributor : Hajer Maraoui
Submitted on : Thursday, May 17, 2018 - 11:41:38 AM
Last modification on : Wednesday, October 28, 2020 - 2:20:04 PM
Long-term archiving on : Wednesday, September 26, 2018 - 1:28:09 AM


Article AISI 2018.pdf
Files produced by the author(s)


  • HAL Id : hal-01794105, version 1



Hajer Maraoui, Kais Haddar, Laurent Romary. Segmentation tool for hadith corpus to generate TEI encoding. 4th International Conference on Advanced Intelligent Systems and Informatics (AISI’18), Sep 2018, Cairo, Egypt. ⟨hal-01794105⟩


