Abstract: The É:CALM resource is constructed from texts written by French pupils and students in a variety of ordinary teaching contexts. What distinguishes the É:CALM resource is that it provides an ecological data set giving a broad overview of texts written at elementary school, high school and university. This paper describes the whole data processing pipeline: encoding of the main graphical aspects of the handwritten primary sources according to the TEI-P5 standard; spelling standardization; POS tagging and syntactic parsing evaluation.
https://hal.archives-ouvertes.fr/hal-02868859
Contributor: Claude Ponton
Submitted on: Monday, June 29, 2020 - 3:52:21 PM
Last modification on: Wednesday, November 17, 2021 - 12:31:08 PM
Lydia-Mai Ho-Dac, Serge Fleury, Claude Ponton. E:Calm Resource: a Resource for Studying Texts Produced by French Pupils and Students. Language Resources and Evaluation Conference, LREC, 12, May 2020, Marseille, France. pp.4327-4332. ⟨hal-02868859⟩