Conference paper, Year: 2012

Multiway Tensor Factorization for Unsupervised Lexical Acquisition

Abstract

This paper introduces a novel method for joint unsupervised acquisition of verb subcategorization frame (SCF) and selectional preference (SP) information. Treating SCF and SP induction as a multi-way co-occurrence problem, we use multi-way tensor factorization to cluster frequent verbs from a large corpus according to their syntactic and semantic behaviour. The method extends previous tensor factorization approaches by predicting whether a syntactic argument is likely to occur with a verb lemma (SCF) as well as which lexical items are likely to occur in the argument slot (SP), and integrates a variety of lexical and syntactic features, including co-occurrence information on grammatical relations not explicitly represented in the SCFs. The SCF lexicon that emerges from the clusters achieves an F-score of 68.7 against a gold standard, while the SP model achieves an accuracy of 77.8 in a novel evaluation that considers all of a verb's arguments simultaneously.
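As a rough illustration of the idea in the abstract, the sketch below factorizes a tiny verb × subject × object count tensor with a non-negative CP (PARAFAC) decomposition and assigns each verb to its dominant latent factor. Everything here is an assumption made for illustration: the abstract does not say which factorization variant, update rule, rank, or feature encoding the paper uses, and the counts, the `<none>` object slot, and the function names (`update`, `factorize`) are invented.

```python
import random

# Toy co-occurrence counts X[verb][subject][object]; all values are invented.
verbs = ["eat", "drink", "sleep"]
subjects = ["dog", "child"]
objects = ["food", "water", "<none>"]  # "<none>": no object observed (assumed encoding)
X = [
    [[8, 0, 1], [6, 0, 1]],  # eat
    [[0, 7, 1], [0, 9, 0]],  # drink
    [[0, 0, 5], [0, 0, 6]],  # sleep
]

def model(A, B, C, i, j, k):
    """Reconstructed value at cell (i, j, k): sum over latent factors."""
    return sum(a * b * c for a, b, c in zip(A[i], B[j], C[k]))

def sq_error(X, A, B, C):
    """Squared reconstruction error over all cells."""
    return sum((X[i][j][k] - model(A, B, C, i, j, k)) ** 2
               for i in range(len(A))
               for j in range(len(B))
               for k in range(len(C)))

def update(F, G, H, x_at, eps=1e-9):
    """Multiplicative (Lee-Seung style) update of F with G and H held fixed.

    x_at(i, j, k) returns the observed count, where i indexes rows of F,
    j rows of G and k rows of H. Factors stay non-negative because each
    entry is multiplied by a ratio of non-negative sums.
    """
    rank = len(F[0])
    for i in range(len(F)):
        num = [0.0] * rank
        den = [0.0] * rank
        for j in range(len(G)):
            for k in range(len(H)):
                x = x_at(i, j, k)
                xhat = sum(F[i][r] * G[j][r] * H[k][r] for r in range(rank))
                for r in range(rank):
                    w = G[j][r] * H[k][r]
                    num[r] += x * w
                    den[r] += xhat * w
        F[i] = [F[i][r] * num[r] / (den[r] + eps) for r in range(rank)]

def factorize(X, rank, iters=200, seed=0):
    """Fit non-negative factor matrices A (verbs), B (subjects), C (objects)."""
    rng = random.Random(seed)
    n_i, n_j, n_k = len(X), len(X[0]), len(X[0][0])
    A = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_i)]
    B = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_j)]
    C = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_k)]
    errors = [sq_error(X, A, B, C)]
    for _ in range(iters):
        update(A, B, C, lambda i, j, k: X[i][j][k])
        update(B, A, C, lambda j, i, k: X[i][j][k])
        update(C, A, B, lambda k, i, j: X[i][j][k])
        errors.append(sq_error(X, A, B, C))
    return A, B, C, errors

rank = 3
A, B, C, errors = factorize(X, rank)

# Hard clustering: each verb goes to the latent factor with its largest weight.
clusters = {verb: max(range(rank), key=lambda r: A[i][r])
            for i, verb in enumerate(verbs)}
print("squared error:", errors[0], "->", errors[-1])
print("verb clusters:", clusters)
```

In this toy encoding, the weight a factor places on the `<none>` slot loosely plays the role of frame information (does an object occur at all), while its weights on the lexical fillers play the role of selectional preferences. A real setup would use many more modes and features than this three-way example.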
Main file
VandeCruysEtAl2012Multi.pdf (167.1 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00783711, version 1 (05-02-2013)

Identifiers

  • HAL Id: hal-00783711, version 1

Cite

Tim van de Cruys, Laura Rimell, Thierry Poibeau, Anna Korhonen. Multiway Tensor Factorization for Unsupervised Lexical Acquisition. COLING 2012, Dec 2012, Mumbai, India. pp.2703-2720. ⟨hal-00783711⟩