Learning Constrained Edit State Machines

Abstract : Learning the parameters of the edit distance has been increasingly studied in recent years to improve the assessment of similarities between structured data, such as strings, trees or graphs. Often based on optimizing the likelihood of pairs of data, the learned models usually take the form of probabilistic state machines, such as pair-Hidden Markov Models (pair-HMM), stochastic transducers, or probabilistic deterministic automata. Although such models have led to significant improvements in edit distance-based classification tasks, a new challenge has emerged: how can background knowledge be integrated during the learning process? This paper addresses that question for (input, output) pairs of strings. We present a generalization of the pair-HMM in the form of a constrained state machine, where a transition between two states is driven by constraints satisfied by the input string. Experimental results are provided on a task in molecular biology that aims to detect transcription factor binding sites.
Document type :
Conference papers

Contributor : Marc Sebban
Submitted on : Friday, May 21, 2010 - 9:25:12 AM
Last modification on : Wednesday, July 25, 2018 - 2:05:31 PM
Long-term archiving on : Thursday, September 16, 2010 - 3:06:01 PM


  • HAL Id : hal-00485560, version 1


Laurent Boyer, Olivier Gandrillon, Amaury Habrard, Mathilde Pellerin, Marc Sebban. Learning Constrained Edit State Machines. 21st IEEE International Conference on Tools with Artificial Intelligence, Nov 2009, United States. pp.734-741. ⟨hal-00485560⟩


