Conference paper, Year: 2018

Computationally efficient discrimination between language varieties with large feature vectors and regularized classifiers

Abstract

The present contribution revolves around efficient approaches to language classification which have been field-tested in the VarDial evaluation campaign. The methods used in several language identification tasks comprising different language types are presented and their results are discussed, giving insights into the real-world application of regularization, linear classifiers and corresponding linguistic features. The use of a specially adapted Ridge classifier proved useful in two of the three tasks. The overall approach (XAC) slightly outperformed most of the other systems on the DFS task (Dutch and Flemish) and on the ILI task (Indo-Aryan languages), while its comparative performance was poorer on the GDI task (Swiss German dialects).
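
The abstract names a Ridge classifier over large feature vectors, but this page gives no implementation details. As a rough illustration only, the sketch below trains a one-vs-rest, ridge-regularized linear classifier on character n-gram counts in plain Python. The n-gram range, the gradient-descent solver, the hyperparameters, the function names and the toy sentences are all assumptions made for this example; none of them is taken from the paper or from the VarDial data, and this is not the XAC system.

```python
from collections import Counter

# Toy sentences invented for illustration (not VarDial data).
TOY_TEXTS = [
    "ik heb het boek gelezen",
    "wij gaan morgen naar huis",
    "ich habe das Buch gelesen",
    "wir gehen morgen nach Hause",
]
TOY_LABELS = ["nl", "nl", "de", "de"]


def char_ngrams(text, n_min=2, n_max=4):
    """Count character n-grams of a lowercased, space-padded string.
    The 2-4 range is an assumption, not a setting from the paper."""
    padded = f" {text.lower()} "
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            counts[padded[i:i + n]] += 1
    return counts


def normalize(counts):
    """Scale a sparse count vector to unit Euclidean length."""
    norm = sum(v * v for v in counts.values()) ** 0.5
    return {k: v / norm for k, v in counts.items()}


def dot(weights, features):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(weights.get(k, 0.0) * v for k, v in features.items())


def train_ridge(texts, labels, alpha=0.01, epochs=300, lr=1.0):
    """Fit one ridge regressor per class on +1/-1 targets.

    Minimizes (1/2N) * sum(err^2) + (alpha/2) * ||w||^2 with full-batch
    gradient descent (an assumed solver, chosen to avoid dependencies).
    """
    feats = [normalize(char_ngrams(t)) for t in texts]
    n = len(feats)
    weights = {c: {} for c in sorted(set(labels))}
    for _ in range(epochs):
        for c, w in weights.items():
            # Gradient of the L2 penalty: alpha * w
            grad = {k: alpha * v for k, v in w.items()}
            # Gradient of the squared error, averaged over the samples
            for x, y in zip(feats, labels):
                err = dot(w, x) - (1.0 if y == c else -1.0)
                for k, v in x.items():
                    grad[k] = grad.get(k, 0.0) + err * v / n
            for k, g in grad.items():
                w[k] = w.get(k, 0.0) - lr * g
    return weights


def predict(weights, text):
    """Return the class whose linear model gives the highest score."""
    x = normalize(char_ngrams(text))
    scores = {c: dot(w, x) for c, w in weights.items()}
    return max(scores, key=scores.get)


model = train_ridge(TOY_TEXTS, TOY_LABELS)
print(predict(model, "het huis is groot"))
```

The penalty weight `alpha` controls how strongly the coefficients are shrunk; with feature spaces much larger than the number of training samples, some such regularization is what keeps a linear model of this kind from simply memorizing the training set.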
Main file
Barbaresi_VarDial2018_regularized.pdf (168.72 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-01858444, version 1 (20-08-2018)

License

Attribution

Identifiers

  • HAL Id: hal-01858444, version 1

Cite

Adrien Barbaresi. Computationally efficient discrimination between language varieties with large feature vectors and regularized classifiers. Fifth Workshop on NLP for Similar Languages, Varieties and Dialects, Aug 2018, Santa Fe, New Mexico, United States. pp.164-171. ⟨hal-01858444⟩