
Processing a Mayan Corpus for Enhancing our Knowledge of Ancient Scripts

Abstract : Ancient Maya writing comprises more than 500 signs, either syllabic or semantic, and is largely deciphered, though with varying degrees of reliability. We applied our graded representation method to the Dresden Codex, one of only three manuscripts that have come down to us, encoded for LaTeX with the mayaTeX package. The method is a hybrid form of unsupervised learning, intermediate between clustering and oblique factor analysis and based on the Hellinger metric, and it yields a nuanced picture of the themes the codex deals with. The statistical entities are the 214 codex segments, and their attributes are the 1687 bigrams of signs extracted from them. For comparison, we introduced an exogenous element into this approach, namely the splitting of composite signs into their elements, for a finer elicitation of the contents. The results are visualized as a set of "thematic concordances": for each homogeneous semantic context, the most salient bigrams or sequences of bigrams are displayed in their textual environment. This sheds new light on the meaning of some poorly understood glyphs by placing them in clearly understandable contexts.
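To make the data layout concrete, here is a minimal Python sketch of the two ingredients the abstract names: segments described by their bigrams of signs, and a Hellinger distance between segments. It is not the authors' graded representation method. The sign labels, segment names and function names are hypothetical placeholders, not actual mayaTeX codes, and the normalisation of counts to relative frequencies is an assumption.

```python
from collections import Counter
from math import sqrt


def bigrams(signs):
    """Return the consecutive pairs of signs in a segment."""
    return list(zip(signs, signs[1:]))


def bigram_profile(signs):
    """Relative frequency of each bigram in a segment (assumed normalisation)."""
    counts = Counter(bigrams(signs))
    total = sum(counts.values())
    if total == 0:  # segment with fewer than two signs
        return {}
    return {bg: n / total for bg, n in counts.items()}


def hellinger(p, q):
    """Hellinger distance between two discrete profiles, in [0, 1]."""
    keys = set(p) | set(q)
    s = sum((sqrt(p.get(k, 0.0)) - sqrt(q.get(k, 0.0))) ** 2 for k in keys)
    return sqrt(s) / sqrt(2)


# Toy segments with made-up sign labels (hypothetical, for illustration only).
segments = {
    "seg1": ["A", "B", "C", "A", "B"],
    "seg2": ["A", "B", "C", "A", "B"],
    "seg3": ["X", "Y", "Z"],
}
profiles = {name: bigram_profile(s) for name, s in segments.items()}

d_same = hellinger(profiles["seg1"], profiles["seg2"])
d_disjoint = hellinger(profiles["seg1"], profiles["seg3"])
```

In this toy data, two segments with identical bigram profiles are at distance 0, and two segments sharing no bigram are at the maximum distance of 1. On the real corpus the same layout would give a table of 214 segments by 1687 bigrams.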
Document type :
Conference papers

Cited literature : 31 references
Contributor : Alain Lelu
Submitted on : Thursday, November 24, 2011 - 9:52:31 AM
Last modification on : Friday, April 2, 2021 - 3:36:59 AM
Long-term archiving on : Saturday, February 25, 2012 - 2:20:06 AM


Files produced by the author(s)


  • HAL Id : hal-00577958, version 1


Bruno Delprat, Mohamed Hallab, Martine Cadot, Alain Lelu. Processing a Mayan Corpus for Enhancing our Knowledge of Ancient Scripts. 4th International Conference on Information Systems and Economic Intelligence - SIIE'2011, Feb 2011, Marrakech, Morocco. pp.198-208. ⟨hal-00577958⟩


