Journal article, Computer Speech and Language, Year: 2011

Bipartite spectral graph partitioning for clustering dialect varieties and detecting their linguistic features

Abstract

In this study we use bipartite spectral graph partitioning to simultaneously cluster varieties and identify their most distinctive linguistic features in Dutch dialect data. While clustering geographical varieties with respect to their features, e.g. pronunciation, is not new, the simultaneous identification of the features which give rise to the geographical clustering presents novel opportunities in dialectometry. Earlier methods aggregated sound differences and clustered on the basis of aggregate differences. The determination of the significant features which co-vary with cluster membership was carried out on a post hoc basis. Bipartite spectral graph clustering simultaneously seeks groups of individual features which are strongly associated, even while seeking groups of sites which share subsets of these same features. We show that the application of this method results in clear and sensible geographical groupings and discuss and analyze the importance of the concomitant features.
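To make the idea of clustering sites and features at the same time more concrete, here is a minimal sketch of a two-way bipartite spectral partition in plain Python. It follows the common normalised-SVD formulation of the technique and is not taken from the article; the function name `bipartition`, the toy site-by-feature matrix, the use of power iteration in place of a full SVD, and the split at zero (standing in for the k-means step that is usually applied) are all assumptions made for illustration. The sketch also assumes every site and every feature has at least one nonzero entry.

```python
import math

def bipartition(A, iters=200):
    """Split rows (sites) and columns (features) of A into two co-clusters."""
    m, n = len(A), len(A[0])
    d1 = [sum(row) for row in A]                              # site degrees
    d2 = [sum(A[i][j] for i in range(m)) for j in range(n)]   # feature degrees

    # Normalised matrix An = D1^{-1/2} A D2^{-1/2}
    An = [[A[i][j] / math.sqrt(d1[i] * d2[j]) for j in range(n)]
          for i in range(m)]

    # The top right singular vector (singular value 1) is known in closed
    # form and carries no cluster information, so it is projected out.
    total = sum(d1)
    v1 = [math.sqrt(d / total) for d in d2]

    def normalise(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    # Power iteration on An^T An with v1 removed converges to the
    # second right singular vector (assumed start vector, hypothetical).
    v = normalise([(-1) ** j * (j + 1) for j in range(n)])
    for _ in range(iters):
        u = [sum(An[i][j] * v[j] for j in range(n)) for i in range(m)]
        v = [sum(An[i][j] * u[i] for i in range(m)) for j in range(n)]
        dot = sum(a * b for a, b in zip(v, v1))
        v = normalise([a - dot * b for a, b in zip(v, v1)])

    # Matching left singular vector, up to a positive scale factor
    u = [sum(An[i][j] * v[j] for j in range(n)) for i in range(m)]

    # Rescale by D^{-1/2}; sites and features now share one embedding,
    # which is split at zero (a simplification of the usual k-means step).
    z_sites = [u[i] / math.sqrt(d1[i]) for i in range(m)]
    z_feats = [v[j] / math.sqrt(d2[j]) for j in range(n)]
    return [int(z > 0) for z in z_sites], [int(z > 0) for z in z_feats]

# Invented toy data: rows are sites, columns are features (1 = feature present).
A = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],   # this site also shows one feature of the other group
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
sites, feats = bipartition(A)
print(sites, feats)
```

The two returned lists use the same label for a group of sites and the features associated with it; which group receives label 0 and which receives label 1 is arbitrary. For more than two clusters, the usual approach is to keep several singular vectors and run k-means on the stacked site and feature coordinates.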
Main file
PEER_stage2_10.1016%2Fj.csl.2010.05.004.pdf (2.71 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-00730283 , version 1 (09-09-2012)

Cite

Martijn Wieling, John Nerbonne. Bipartite spectral graph partitioning for clustering dialect varieties and detecting their linguistic features. Computer Speech and Language, 2011, 25 (3), pp.700. ⟨10.1016/j.csl.2010.05.004⟩. ⟨hal-00730283⟩

Collections

PEER