Using Contextual Representations to Efficiently Learn Context-Free Languages

Abstract : We present a polynomial update-time algorithm for the inductive inference of a large class of context-free languages from positive data and a membership oracle. We achieve this result by moving to a novel representation, Contextual Binary Feature Grammars (CBFGs), which can represent richly structured context-free languages as well as some context-sensitive languages. These representations explicitly model the lattice structure of the distribution of a set of substrings and can be inferred using a generalisation of distributional learning. The formalism is an attempt to bridge the gap between simple learnable classes and the highly expressive representations necessary for linguistic representation: it makes learnable a large class of context-free languages that includes all regular languages and those context-free languages that satisfy two simple constraints. The formalism and the algorithm seem well suited to natural language, and in particular to the modelling of first language acquisition. Preliminary experimental results confirm the effectiveness of this approach.
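The distributional idea mentioned in the abstract can be illustrated with a small sketch. It is not the authors' learning algorithm and does not build a CBFG; it only shows how the contexts of substrings can be collected from positive data and how a membership oracle can decide which contexts a substring accepts. The function names (`contexts_by_substring`, `features`, `is_anbn`) and the toy language a^n b^n are assumptions chosen for illustration.

```python
from collections import defaultdict


def is_anbn(w):
    """Toy membership oracle (assumption): accepts a^n b^n for n >= 1."""
    n = len(w) // 2
    return len(w) > 0 and len(w) % 2 == 0 and w == "a" * n + "b" * n


def contexts_by_substring(sample):
    """Map each non-empty substring u to the contexts (l, r) such that
    l + u + r is a string of the positive sample."""
    table = defaultdict(set)
    for w in sample:
        for i in range(len(w)):
            for j in range(i + 1, len(w) + 1):
                table[w[i:j]].add((w[:i], w[j:]))
    return dict(table)


def features(u, contexts, oracle):
    """Contexts (l, r) from `contexts` for which the oracle accepts l + u + r."""
    return {(l, r) for (l, r) in contexts if oracle(l + u + r)}


sample = ["ab", "aabb"]            # positive data only
table = contexts_by_substring(sample)

# Use the contexts of the whole sample strings' substrings as candidate features.
candidate_contexts = {("", ""), ("a", "b"), ("aa", "bb")}

feats_ab = features("ab", candidate_contexts, is_anbn)
feats_aabb = features("aabb", candidate_contexts, is_anbn)
feats_a = features("a", candidate_contexts, is_anbn)
```

In this toy setting `ab` and `aabb` accept the same candidate contexts, while `a` accepts none of them; comparing such context sets by inclusion is the kind of structure on substring distributions that the abstract refers to.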
Document type : Journal articles

https://hal.archives-ouvertes.fr/hal-00607098
Contributor : Amaury Habrard
Submitted on : Thursday, July 7, 2011 - 7:57:13 PM
Last modification on : Thursday, June 27, 2019 - 1:36:06 PM
Long-term archiving on : Monday, November 12, 2012 - 10:25:53 AM

File

Learning_CBFG.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00607098, version 1

Citation

Alexander Clark, Rémi Eyraud, Amaury Habrard. Using Contextual Representations to Efficiently Learn Context-Free Languages. Journal of Machine Learning Research, Microtome Publishing, 2010, 11, pp.2707-2744. ⟨hal-00607098⟩
