HAL: inria-00294636, version 1

On Probability Distributions for Trees: Representations, Inference and Learning
Denis F., Habrard A., Gilleron R., Tommasi M., Gilbert E.
NIPS Workshop on Representations and Inference on Probability Distributions, Whistler : Canada (2007) - http://hal.inria.fr/inria-00294636
Conference papers
Computer Science/Learning
François Denis 1, Amaury Habrard 1, Rémi Gilleron 2, 3, Marc Tommasi 2, 3, 4, Édouard Gilbert 3
1:  Laboratoire d'informatique Fondamentale de Marseille (LIF)
http://www.lif.univ-mrs.fr/
CNRS : UMR6166 – Université de la Méditerranée - Aix-Marseille II – Université de Provence - Aix-Marseille I
CMI 39, Rue Joliot Curie 13453 MARSEILLE CEDEX 13
France
2:  Laboratoire d'Informatique Fondamentale de Lille (LIFL)
http://www.lifl.fr/
CNRS : UMR8022 – Université Lille I - Sciences et technologies – Université Lille III - Sciences humaines et sociales – INRIA
Bâtiment M3 59655 Villeneuve d'Ascq Cédex
France
3:  MOSTRARE (INRIA Futurs)
INRIA – CNRS : UMR8022 – Université Lille I - Sciences et technologies – Université Lille III - Sciences humaines et sociales
France
4:  GRAPPA (LIFL)
http://www.grappa.univ-lille3.fr
CNRS : UMR8022 – Université Lille III - Sciences humaines et sociales – Université Lille I - Sciences et technologies
Maison de la recherche, domaine Universitaire Pont de Bois Université Charles de Gaulle - Lille 3 59653 Villeneuve d'Ascq CEDEX
France
We study probability distributions over free algebras of trees. Probability distributions can be seen as particular (formal power) tree series [Berstel et al. 82, Esik et al. 03], i.e. mappings from trees to a semiring K. A widely studied class of tree series is the class of rational (or recognizable) tree series, which can be defined either algebraically or by means of multiplicity tree automata. We argue that the algebraic representation is very convenient for modeling probability distributions over a free algebra of trees. First, as in the string case, the algebraic representation allows the design of learning algorithms for the whole class of probability distributions defined by rational tree series. Note that learning algorithms for rational tree series correspond to learning algorithms for weighted tree automata in which both the structure and the weights are learned. Second, the algebraic representation can easily be extended to deal with unranked trees (such as XML trees, where a symbol may have an unbounded number of children). Both properties are particularly relevant for applications: nondeterministic automata are required for the inference problem to be relevant (recall that Hidden Markov Models are equivalent to nondeterministic string automata), and current applications in Web Information Extraction, Web Services and document processing consider unranked trees.
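As an illustration of the algebraic representation mentioned in the abstract, here is a minimal sketch under the textbook definition of a rational tree series: each symbol of arity p is interpreted as a multilinear map from p vectors of K^n to K^n, and a linear form maps the resulting vector to a value. The encoding (nested tuples for trees, a sparse dictionary for each multilinear map) and the names `evaluate`, `series_value`, `MU`, `LAM` and `DIM` are assumptions made for this example and are not taken from the paper.

```python
from math import prod

def evaluate(tree, mu, dim):
    """Return the vector associated with `tree` in K^dim, computed bottom-up."""
    symbol, *children = tree
    child_vecs = [evaluate(child, mu, dim) for child in children]
    result = [0.0] * dim
    # mu[symbol] is a sparse multilinear map: a tuple of child coordinates
    # (one index per child) maps to the weight vector it contributes.
    for indices, weights in mu[symbol].items():
        coeff = prod(vec[i] for vec, i in zip(child_vecs, indices))
        for k, w in enumerate(weights):
            result[k] += coeff * w
    return result

def series_value(tree, mu, lam, dim):
    """Apply the linear form `lam` to the vector of `tree`."""
    vec = evaluate(tree, mu, dim)
    return sum(l * x for l, x in zip(lam, vec))

# Hypothetical one-dimensional example over the ranked alphabet {a/0, f/2}:
# a leaf has weight 0.6 and a binary node has weight 0.4.
DIM = 1
MU = {
    "a": {(): [0.6]},      # arity 0: a constant vector
    "f": {(0, 0): [0.4]},  # arity 2: bilinear map (x, y) -> 0.4 * x * y
}
LAM = [1.0]

LEAF = ("a",)
SMALL = ("f", LEAF, LEAF)
LARGER = ("f", SMALL, LEAF)

print(series_value(LEAF, MU, LAM, DIM))
print(series_value(SMALL, MU, LAM, DIM))
print(series_value(LARGER, MU, LAM, DIM))
```

With these illustrative weights, a tree with k binary nodes receives the value 0.6^(k+1) × 0.4^k, and the values sum to 1 over all binary trees, so this particular series is a probability distribution. Larger dimensions give nondeterministic behaviour, since several coordinates can contribute to the same tree.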
F.: Theory of Computation/F.4: MATHEMATICAL LOGIC AND FORMAL LANGUAGES/F.4.3: Formal Languages/F.4.3.1: Classes defined by grammars or automata (e.g., context-free languages, regular sets, recursive sets)
I.: Computing Methodologies/I.5: PATTERN RECOGNITION/I.5.1: Models/I.5.1.5: Structural
English

2007
international
NIPS Workshop on Representations and Inference on Probability Distributions
Whistler
Canada
2007-12-08

Tree automata – tree series – probability distributions – weighted tree automata – machine learning
Project Id ANR-05-MMSA-0016
Year 2005
Project acronym marmota
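Project title Apprentissage automatique, modèles probabilistes et langages d'arbres (Machine learning, probabilistic models and tree languages)
Program title Masse de données : Modélisation, Simulation, Applications (Massive data: modeling, simulation, applications)
Program acronym MMSA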
Files attached to this document:
TEX
nips07.tex(13.9 KB)
ijcai07.sty(13.2 KB)
nips07.bbl(2.2 KB)
xcolor.sty(53.9 KB)
PDF
nips07.pdf(107.3 KB)
PS
nips07.ps(105.4 KB)