Processing a Mayan Corpus for Enhancing our Knowledge of Ancient Scripts
Abstract
The ancient Maya script comprises more than 500 signs, either syllabic or semantic, and is largely deciphered, though with varying degrees of reliability. We applied our graded representation method to the Dresden Codex, one of only three manuscripts that have come down to us, encoded for LaTeX with the mayaTeX package. The method is a hybrid form of unsupervised learning, intermediate between clustering and oblique factor analysis, and it uses the Hellinger metric; the aim is a nuanced picture of the themes the codex deals with. The statistical entities are the 214 codex segments, and their attributes are the 1687 bigrams of signs extracted from them. For comparison, we introduced an exogenous element into this approach, namely the splitting of composite signs into their constituent elements, for a finer elicitation of the contents. The results are visualized as a set of "thematic concordances": for each homogeneous semantic context, the most salient bigrams or sequences of bigrams are displayed in their textual environment. This sheds new light on the meaning of some poorly understood glyphs by placing them in clearly understandable contexts.
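As a rough illustration of the data representation described above (segments as entities, sign bigrams as attributes, compared under a Hellinger metric), the following minimal sketch builds bigram profiles for a few toy segments and computes pairwise Hellinger distances. It is not the authors' graded representation method: the sign codes (`S1`, `S2`, ...), the segment names, and the helper functions are all hypothetical, and the distance shown is the standard Hellinger distance between relative-frequency profiles.

```python
from collections import Counter
from math import sqrt


def bigrams(signs):
    """Return the adjacent sign pairs of one segment, in reading order."""
    return list(zip(signs, signs[1:]))


def profile(signs):
    """Relative-frequency profile of the bigrams found in one segment."""
    counts = Counter(bigrams(signs))
    total = sum(counts.values())
    return {bg: n / total for bg, n in counts.items()} if total else {}


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    0 means identical profiles, 1 means no bigram in common.
    """
    keys = set(p) | set(q)
    s = sum((sqrt(p.get(k, 0.0)) - sqrt(q.get(k, 0.0))) ** 2 for k in keys)
    return sqrt(s / 2)


# Hypothetical toy segments: the sign codes are placeholders,
# not real transcriptions from the Dresden Codex.
segments = {
    "seg_a": ["S1", "S2", "S3", "S1", "S2"],
    "seg_b": ["S1", "S2", "S3", "S4"],
    "seg_c": ["S7", "S8", "S9"],
}

profiles = {name: profile(signs) for name, signs in segments.items()}
d_ab = hellinger(profiles["seg_a"], profiles["seg_b"])  # share two bigrams
d_ac = hellinger(profiles["seg_a"], profiles["seg_c"])  # share no bigram
```

In this toy setting, segments that share bigrams end up closer than segments that share none, which is the kind of proximity a clustering or factor-analytic step could then exploit to group segments into themes.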
Domains
Other [cs.OH]
Origin: Files produced by the author(s)