Book chapter. Year: 2013

This and that in native and learner English: From typology of use to tagset characterisation

Abstract

Learner corpus research is now faced with a multiplicity of tagsets. Cross-corpus analysis is therefore difficult, because the tags used for each part of speech (POS) vary from one tagset to another. In this paper, we approach this issue through a specific linguistic point: this and that. We propose a typology of their uses in both native and non-native corpora. Various tagsets are analysed in order to measure the relevance of the linguistic information they provide for this and that. Overall, we propose a comparative analysis of this and that across tagsets and assess the benefits and flaws of manual fine-grained annotation versus automatic annotation. This study is a first step towards the automated annotation of this and that in various corpora, a process that would pave the way to corpus interoperability at POS level.

Domains

Linguistics
Main file
LCR2011_proceedings_Gaillat_2013.pdf (184.99 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01171279, version 1 (19-05-2017)

Identifiers

  • HAL Id: hal-01171279, version 1

Cite

Thomas Gaillat. This and that in native and learner English: From typology of use to tagset characterisation. Granger Sylviane; Gilquin Gaëtanelle; Meunier Fanny. Twenty years of learner research: looking back, moving ahead. Proceedings of the First Learner Corpus Research Conference (LCR 2011), Presses Universitaires de Louvain, 2013. ⟨hal-01171279⟩
