Using Contextual Representations to Efficiently Learn Context-Free Languages

Alexander Clark; Rémi Eyraud; Amaury Habrard

Journal article, Journal of Machine Learning Research, Year: 2010


Alexander Clark

Role: Author
PersonId: 856797

Department of Computer Science

Rémi Eyraud

Role: Author

Laboratoire d'informatique Fondamentale de Marseille - UMR 6166

Amaury Habrard

Role: Author
PersonId: 439
IdHAL: amaury-habrard
ORCID: 0000-0003-3038-9347
IdRef: 084103655

Laboratoire d'informatique Fondamentale de Marseille - UMR 6166

Abstract

We present a polynomial update time algorithm for the inductive inference of a large class of context-free languages using the paradigm of positive data and a membership oracle. We achieve this result by moving to a novel representation, called Contextual Binary Feature Grammars (CBFGs), which are capable of representing richly structured context-free languages as well as some context-sensitive languages. These representations explicitly model the lattice structure of the distribution of a set of substrings and can be inferred using a generalisation of distributional learning. This formalism is an attempt to bridge the gap between simple learnable classes and the sorts of highly expressive representations necessary for linguistic representation: it allows the learnability of a large class of context-free languages, which includes all regular languages and those context-free languages that satisfy two simple constraints. The formalism and the algorithm seem well suited to natural language and in particular to the modeling of first language acquisition. Preliminary experimental results confirm the effectiveness of this approach.
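As a rough illustration of the distributional idea mentioned in the abstract, the sketch below computes, for a substring, the subset of a fixed finite set of contexts in which it occurs, using only a membership oracle. It is a minimal sketch and not the authors' algorithm: the toy language {a^n b^n : n >= 1}, the context set, and the names `is_anbn` and `contextual_features` are all assumptions made for this example.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

# A context is a (left, right) pair of strings wrapped around a substring.
Context = Tuple[str, str]


def is_anbn(w: str) -> bool:
    """Toy membership oracle for {a^n b^n : n >= 1} (assumed for illustration)."""
    n = len(w) // 2
    return n >= 1 and w == "a" * n + "b" * n


def contextual_features(
    u: str,
    contexts: Iterable[Context],
    oracle: Callable[[str], bool],
) -> FrozenSet[Context]:
    """Return the contexts (l, r) from the given set such that l + u + r is accepted."""
    return frozenset((l, r) for (l, r) in contexts if oracle(l + u + r))


# A small, hand-picked set of contexts (hypothetical; a learner would collect
# these from positive data).
CONTEXTS = [("", ""), ("a", "b"), ("a", ""), ("", "b"), ("aa", "bb")]

features_ab = contextual_features("ab", CONTEXTS, is_anbn)
features_a = contextual_features("a", CONTEXTS, is_anbn)
features_aab = contextual_features("aab", CONTEXTS, is_anbn)
```

Under these assumptions, "a" and "aab" receive the same feature set, which is the kind of shared distribution that distributional learning exploits, while "ab" receives a different one. Ordering such feature sets by inclusion gives a small lattice over substrings.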

Keywords

grammatical inference; context-free language; positive data only; membership queries

Domains

Machine Learning [cs.LG]

Main file

Learning_CBFG.pdf (335.83 KB)

Origin: Files produced by the author(s)


https://hal.science/hal-00607098

Submitted on: Thursday, July 7, 2011, 19:57:13

Last modified on: Friday, March 24, 2023, 14:52:54

Long-term archiving on: Monday, November 12, 2012, 10:25:53

Dates and versions

hal-00607098 , version 1 (07-07-2011)

Identifiers

HAL Id : hal-00607098 , version 1

Cite

Alexander Clark, Rémi Eyraud, Amaury Habrard. Using Contextual Representations to Efficiently Learn Context-Free Languages. Journal of Machine Learning Research, 2010, 11, pp.2707-2744. ⟨hal-00607098⟩


Collections

LIF CNRS UNIV-AMU LIS-LAB

