Conference papers

Parsing Poorly Standardized Language Dependency on Old French

Abstract : This paper presents dependency parsing results for Old French, a language that is poorly standardized at the lexical level and has a relatively free word order. The work uses five distinct sample texts extracted from the dependency treebank Syntactic Reference Corpus of Medieval French (SRCMF). Following Achim Stein's previous work, we trained the Mate parser on each sub-corpus and cross-validated the results. We show that the greater lexical variation of Old French lowers parsing performance relative to results on modern French. To improve the POS tagging step of the parsing process, we preprocessed the data and compared two strategies: a lightly post-processed version of the TreeTagger trained on Old French by Stein, and a CRF trained on the texts and enriched with external resources. The CRF version outperforms every other approach.
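
The abstract does not spell out the cross-validation protocol. As a minimal sketch, assuming it means training on one sub-corpus and evaluating on each of the other four, the train/test pairs could be enumerated as below. The text names are hypothetical placeholders (the abstract does not list the five SRCMF texts), and the training and scoring steps are left out because they depend on the parser's own tooling.

```python
from itertools import permutations

# Hypothetical placeholder names: the abstract does not name the five texts.
SUB_CORPORA = ["text_1", "text_2", "text_3", "text_4", "text_5"]


def cross_corpus_pairs(names):
    """Return every ordered (train, test) pair in which the two texts differ."""
    return list(permutations(names, 2))


def mean_score(scores):
    """Average a mapping {(train, test): score} over all evaluated pairs."""
    return sum(scores.values()) / len(scores)


# Assumed protocol: each text serves once as training data for each other text.
pairs = cross_corpus_pairs(SUB_CORPORA)
```

With five texts this yields 20 ordered pairs. A score for each pair (for example an attachment score produced by an external evaluation script) can then be stored in a dictionary keyed by the pair and averaged with `mean_score`.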

Contributor : Sophie PREVOST
Submitted on : Wednesday, January 6, 2016 - 1:55:19 PM
Last modification on : Thursday, March 17, 2022 - 10:08:40 AM
Long-term archiving on : Thursday, November 10, 2016 - 9:37:56 PM


Files produced by the author(s)


  • HAL Id : hal-01250959, version 2



Gaël Guibon, Isabelle Tellier, Mathieu Constant, Sophie Prévost, Kim Gerdes. Parsing Poorly Standardized Language Dependency on Old French. Thirteenth International Workshop on Treebanks and Linguistic Theories (TLT13), Dec 2014, Tübingen, Germany. pp.51-61. ⟨hal-01250959v2⟩


