Polynomial Identification in the limit of context-free substitutable languages

A. Clark; Rémi Eyraud

Journal article, Journal of Machine Learning Research, Year: 2007

A. Clark

Role: Author

Department of Computer Science

Rémi Eyraud

Role: Author
PersonId: 1233063
IdHAL: remi-eyraud
ORCID: 0000-0002-5728-4759

Laboratoire d'informatique Fondamentale de Marseille - UMR 6166

éQuipe AppRentissage et MultimediA [Marseille]

Abstract

This paper formalises the idea of substitutability introduced by Zellig Harris in the 1950s and makes it the basis for an algorithm that learns a subclass of context-free languages from positive data only. We show that there is a polynomial characteristic set, and thus prove polynomial identification in the limit of this class. We discuss the relationship of this class of languages to other common classes discussed in grammatical inference. It transpires that it is not necessary to identify constituents in order to learn a context-free language: it is sufficient to identify the syntactic congruence, and the operations of the syntactic monoid can be converted into a context-free grammar. We also discuss modifications to the algorithm that produce a reduction system rather than a context-free grammar, which will be much more compact. We discuss the relationship to Angluin's notion of reversibility for regular languages. We also demonstrate that an implementation of this algorithm is capable of learning a classic example of structure-dependent syntax in English: this constitutes a refutation of an argument that has been used in support of nativist theories of language.
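As a rough illustration of the substitutability idea, the sketch below groups the substrings of a positive sample into classes whenever they occur in a shared context. It is not the authors' implementation: it assumes the usual reading of substitutability (two substrings are related if some pair (l, r) surrounds both in the sample) and treats words as plain Python strings. The function names `contexts_by_substring` and `substitutability_classes` are hypothetical, and the step that turns classes into grammar rules is omitted.

```python
from collections import defaultdict


def contexts_by_substring(sample):
    """Map each non-empty substring u to the contexts (l, r) such that l + u + r is in the sample."""
    table = defaultdict(set)
    for word in sample:
        n = len(word)
        for i in range(n):
            for j in range(i + 1, n + 1):
                table[word[i:j]].add((word[:i], word[j:]))
    return table


def substitutability_classes(sample):
    """Merge substrings that share at least one context, closing under transitivity."""
    table = contexts_by_substring(sample)
    parent = {u: u for u in table}  # union-find forest over substrings

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    # Index substrings by context, then merge everything seen in the same context.
    by_context = defaultdict(list)
    for u, ctxs in table.items():
        for ctx in ctxs:
            by_context[ctx].append(u)
    for members in by_context.values():
        for v in members[1:]:
            parent[find(v)] = find(members[0])

    classes = defaultdict(set)
    for u in table:
        classes[find(u)].add(u)
    return list(classes.values())


# Two positive examples from the language { a^n c b^n : n >= 0 }.
classes = substitutability_classes(["c", "acb"])
```

Here "c" and "acb" both occur in the empty context, so they fall into one class, which is the observation that would let a learner treat "acb" as substitutable wherever "c" appears.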

Keywords

natural languages; string rewriting systems; context-free languages; grammar induction

Domains

Machine Learning [cs.LG]; Document and Text Processing

Main file

JMLR_clark_eyraud_07.pdf (146.74 KB)

Origin: Publisher files allowed on an open archive


https://hal.science/hal-00186889

Submitted on: Wednesday, 14 November 2007, 10:38:42

Last modified on: Monday, 4 March 2024, 15:36:49

Long-term archiving on: Monday, 12 April 2010, 01:58:35

Dates and versions

hal-00186889, version 1 (14-11-2007)

Identifiers

HAL Id: hal-00186889, version 1

Cite

A. Clark, Rémi Eyraud. Polynomial Identification in the limit of context-free substitutable languages. Journal of Machine Learning Research, 2007, 8, pp.1725--1745. ⟨hal-00186889⟩


Collections

LIF CNRS UNIV-AMU EC-MARSEILLE LIS-LAB

