Abstract : We present the first algorithm for learning n-ary node selection queries in trees from completely annotated examples by methods of grammatical inference. We propose to represent n-ary queries by deterministic n-ary node selecting tree transducers (NSTTs), that are known to capture the class of MSO-definable n-ary queries. Despite of this highly expressive, we show that n-aryy queries, selecting a polynomially bounded number of tuples per tree, represented by deterministic NSTTs can be learned from polynomial time and data while allowing for efficient enumeration of query answers. An application to wrapper induction in Web information extraction yields encouraging results.
https://hal.inria.fr/inria-00088077
Contributor : Joachim Niehren <>
Submitted on : Sunday, August 13, 2006 - 9:42:21 PM Last modification on : Tuesday, April 28, 2020 - 11:52:12 AM Long-term archiving on: : Monday, September 20, 2010 - 4:08:09 PM
Aurélien Lemay, Joachim Niehren, Rémi Gilleron. Learning n-ary Node Selecting Tree Transducers from Completely Annotated Examples. 8th International Colloquium on Grammatical Inference, Sep 2006, Tokyo, Japan. pp.253-267. ⟨inria-00088077v2⟩