Conference paper, Year: 2022

Investigating phonological theories with crowd-sourced data: The Inventory Size Hypothesis in the light of Lingua Libre

Abstract

Data-driven research in phonetics and phonology relies heavily on oral resources and on access to them. We explore a question in comparative linguistics using an open-source, crowd-sourced corpus, Lingua Libre, Wikimedia’s participatory linguistic library, to show that such corpora may offer a solution to typologists wishing to explore numerous languages at once. For the present proof of concept, we compare the realizations of Italian and Spanish vowels (sample size = 5000) to investigate whether vowel production is influenced by the size of the phonemic inventory (the Inventory Size Hypothesis), by the exact shape of the inventory (the Vowel Quality Hypothesis), or by neither. Results show that the size of the inventory does not seem to influence vowel production, in line with previous research, but that the shape of the inventory may well be a factor determining the extent of variation in vowel production. Above all, these results show that Lingua Libre has the potential to provide valuable data for linguistic inquiry.

Domains

Linguistics
Main file
2022.sigmorphon-1.3.pdf (879 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03725715, version 1 (08-12-2022)

Identifiers

Cite

Mathilde Hutin, Marc Allassonnière-Tang. Investigating phonological theories with crowd-sourced data: The Inventory Size Hypothesis in the light of Lingua Libre. 19th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, Jul 2022, Seattle, United States. pp. 23-28, ⟨10.18653/v1/2022.sigmorphon-1.3⟩. ⟨hal-03725715⟩