N. Abe and M. Warmuth, On the computational complexity of approximating distributions by probabilistic automata, Machine Learning, vol.9, issue.2-3, pp.205-260, 1992.
DOI : 10.1007/BF00992677

D. Angluin, Identifying languages from stochastic examples, 1988.

R. Bailly, QWA: Spectral algorithm, Conference Proceedings, pp.147-163, 2011.

R. Bailly, F. Denis, and L. Ralaivola, Grammatical inference as a principal component analysis problem, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pp.33-40, 2009.
DOI : 10.1145/1553374.1553379

B. Balle, J. Castro, and R. Gavaldà, Bootstrapping and learning PDFA in data streams, Conference Proceedings, ICGI'12, pp.34-48, 2012.

L. E. Baum, An inequality and associated maximization technique in statistical estimation for probabilistic functions of Markov processes, Inequalities, vol.3, pp.1-8, 1972.

L. E. Baum, T. Petrie, G. Soules, and N. Weiss, A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains, The Annals of Mathematical Statistics, vol.41, issue.1, pp.164-171, 1970.
DOI : 10.1214/aoms/1177697196

A. Beimel, F. Bergadano, N. H. Bshouty, E. Kushilevitz, and S. Varricchio, Learning functions represented as multiplicity automata, Journal of the ACM, vol.47, issue.3, pp.506-530, 2000.
DOI : 10.1145/337244.337257
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.133.2653

F. Bergadano and S. Varricchio, Learning Behaviors of Automata from Multiplicity and Equivalence Queries, SIAM Journal on Computing, vol.25, issue.6, pp.1268-1280, 1996.
DOI : 10.1137/S009753979326091X

D. Blei and M. Jordan, Variational inference for Dirichlet process mixtures, Bayesian Analysis, vol.1, issue.1, pp.121-143, 2006.
DOI : 10.1214/06-BA104

J. Borges and M. Levene, Data Mining of User Navigation Patterns, Web Usage Mining and User Profiling, volume 1836 of LNCS, pp.92-111, 2000.
DOI : 10.1007/3-540-44934-5_6

E. Brill, R. Florian, J. C. Henderson, and L. Mangu, Beyond n-grams: Can linguistic sophistication improve language modeling?, Proceedings of COLING-ACL'98, pp.186-190, 1998.

R. C. Carrasco, M. Forcada, and L. Santamaria, Inferring stochastic regular grammars with recurrent neural networks, Proceedings of ICGI'96, volume 1147 of LNAI, pp.274-281, 1996.
DOI : 10.1007/BFb0033361
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.43.5493

R. C. Carrasco and J. Oncina, Learning stochastic regular grammars by means of a state merging method, Proceedings of ICGI'94, LNAI, pp.139-150, 1994.
DOI : 10.1007/3-540-58473-0_144

R. C. Carrasco, J. Oncina, and J. Calera-Rubio, Stochastic inference of regular tree languages, Machine Learning Journal, vol.44, issue.1, pp.185-197, 2001.
DOI : 10.1007/BFb0054075

J. Castro and R. Gavaldà, Towards Feasible PAC-Learning of Probabilistic Deterministic Finite Automata, Proceedings of ICGI'08, pp.163-174, 2008.
DOI : 10.1007/978-3-540-88009-7_13

S. F. Chen and J. Goodman, An empirical study of smoothing techniques for language modeling, ACL, pp.310-318, 1996.

A. Clark and F. Thollard, PAC-learnability of probabilistic deterministic finite state automata, Journal of Machine Learning Research, vol.5, pp.473-497, 2004.

T. Cover and J. Thomas, Elements of Information Theory, 1991.

P. Cruz-Alcázar and E. Vidal, Two Grammatical Inference Applications in Music Processing, Applied Artificial Intelligence, vol.27, issue.1-2, pp.53-76, 2008.
DOI : 10.1016/S0022-0000(72)80020-5

C. de la Higuera, Grammatical inference: learning automata and grammars, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00476128

C. de la Higuera and J. Oncina, Identification with Probability One of Stochastic Deterministic Linear Languages, Proceedings of ALT'03, volume 2842 of LNCS, pp.134-148, 2003.
DOI : 10.1007/978-3-540-39624-6_20

C. de la Higuera and J. Oncina, Learning probabilistic finite automata, Proceedings of ICGI'04, LNAI, pp.175-186, 2004.
DOI : 10.1017/CBO9781139194655.017

C. de la Higuera and F. Thollard, Identification in the limit with probability one of stochastic deterministic finite automata, Proceedings of ICGI'00, volume 1891 of LNAI, pp.15-24, 2000.

F. Denis and Y. Esposito, Learning Classes of Probabilistic Automata, Proceedings of COLT, 2004.
DOI : 10.1007/978-3-540-27819-1_9

F. Denis, Y. Esposito, and A. Habrard, Learning Rational Stochastic Languages, Proceedings of COLT 2006, pp.274-288, 2006.
DOI : 10.1007/11776420_22
URL : https://hal.archives-ouvertes.fr/hal-00019161

F. Denis, A. Lemay, and A. Terlutte, Learning Regular Languages Using Non Deterministic Finite Automata, Proceedings of ICGI'00, volume 1891 of LNAI, pp.39-50, 2000.
DOI : 10.1007/978-3-540-45257-7_4

F. Denis, A. Lemay, and A. Terlutte, Learning regular languages using RFSA, Proceedings of ALT'01, volume 2225 of LNCS, pp.348-363, 2001.
DOI : 10.1007/3-540-45583-3_26
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.12.5641

P. Dupont and J. Amengual, Smoothing probabilistic automata: an error-correcting approach, Proceedings of ICGI'00, volume 1891 of LNAI, pp.51-62, 2000.
DOI : 10.1007/978-3-540-45257-7_5
URL : http://biblio.info.ucl.ac.be/2000/272756.pdf

P. Dupont, F. Denis, and Y. Esposito, Links between probabilistic automata and hidden Markov models: probability distributions, learning models and induction algorithms, Pattern Recognition, vol.38, issue.9, pp.1349-1371, 2005.
DOI : 10.1016/j.patcog.2004.03.020

Y. Esposito, A. Lemay, F. Denis, and P. Dupont, Learning Probabilistic Residual Finite State Automata, Proceedings of ICGI'02, volume 2484 of LNAI, pp.77-91, 2002.
DOI : 10.1007/3-540-45790-9_7
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.12.6820

J. Gao and M. Johnson, A comparison of Bayesian estimators for unsupervised Hidden Markov Model POS taggers, Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP '08, pp.344-352, 2008.
DOI : 10.3115/1613715.1613761

R. Gavaldà, P. W. Keller, J. Pineau, and D. Precup, PAC-Learning of Markov Models with Hidden State, Proceedings of ECML'06, pp.150-161, 2006.
DOI : 10.1007/11871842_18

A. Gelfand and A. Smith, Sampling-Based Approaches to Calculating Marginal Densities, Journal of the American Statistical Association, vol.85, issue.410, pp.398-409, 1990.
DOI : 10.1080/01621459.1986.10478240
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.512.2330

D. Gildea and D. Jurafsky, Learning bias and phonological-rule induction, Computational Linguistics, vol.22, pp.497-530, 1996.

T. Goan, N. Benson, and O. Etzioni, A grammar inference algorithm for the world wide web, Proceedings of AAAI Spring Symposium on Machine Learning in Information Access, 1996.

P. Grünwald, The minimum description length principle, 2007.

O. Guttman, Probabilistic Automata and Distributions over Sequences, 2006.

O. Guttman, S. V. Vishwanathan, and R. C. Williamson, Learnability of Probabilistic Automata via Oracles, Proceedings of ALT'05, pp.171-182, 2005.
DOI : 10.1007/11564089_15

A. Habrard, M. Bernard, and M. Sebban, Improvement of the State Merging Rule on Noisy Data in Probabilistic Grammatical Inference, Proceedings of ECML'03, volume 2837 of LNAI, pp.169-180, 2003.
DOI : 10.1007/978-3-540-39857-8_17

A. Habrard, F. Denis, and Y. Esposito, Using Pseudo-stochastic Rational Languages in Probabilistic Grammatical Inference, Proceedings of ICGI'06, LNAI, pp.112-124, 2006.
DOI : 10.1007/11872436_10
URL : https://hal.archives-ouvertes.fr/hal-00085176

M. Heule and S. Verwer, Exact DFA Identification Using SAT Solvers, Proceedings of ICGI'10, pp.66-79, 2010.
DOI : 10.1007/978-3-642-15488-1_7
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.415.7380

J. J. Horning, A study of Grammatical Inference, 1969.

M. Hulden, Treba: Efficient numerically stable EM for PFA, Conference Proceedings, ICGI'12, pp.249-253, 2012.

H. I. Akram, A. Batard, C. de la Higuera, and C. Eckert, PSMA: A parallel algorithm for learning regular languages, NIPS Workshop on Learning on Cores, Clusters and Clouds, 2010.

F. Jelinek, Statistical Methods for Speech Recognition, 1998.

M. J. Kearns, Y. Mansour, D. Ron, R. Rubinfeld, R. E. Schapire et al., On the learnability of discrete distributions, Proceedings of the twenty-sixth annual ACM symposium on Theory of computing, STOC '94, pp.273-282, 1994.
DOI : 10.1145/195058.195155

M. J. Kearns and U. Vazirani, An Introduction to Computational Learning Theory, 1994.

F. Kepler, S. Mergen, and C. Billa, Simple variable length n-grams for probabilistic automata learning, Conference Proceedings, ICGI'12, pp.254-258, 2012.

S. Kullback and R. A. Leibler, On Information and Sufficiency, The Annals of Mathematical Statistics, vol.22, issue.1, pp.79-86, 1951.
DOI : 10.1214/aoms/1177729694

K. J. Lang, B. A. Pearlmutter, and R. A. Price, Results of the Abbadingo One DFA learning competition and a new evidence-driven state merging algorithm, Proceedings of ICGI'98, volume 1433 of LNAI, pp.1-12, 1998.
DOI : 10.1007/BFb0054059

D. Lee and M. Yannakakis, Principles and methods of testing finite state machines-a survey, Proceedings of the IEEE, vol.84, issue.8, pp.1090-1123, 1996.
DOI : 10.1109/5.533956

P. Milani Comparetti, G. Wondracek, C. Kruegel, and E. Kirda, Prospex: Protocol Specification Extraction, 2009 30th IEEE Symposium on Security and Privacy, 2009.
DOI : 10.1109/SP.2009.14

M. Mohri, Finite-state transducers in language and speech processing, Computational Linguistics, vol.23, issue.3, pp.269-311, 1997.

R. Neal, Markov chain sampling methods for dirichlet process mixture models, Journal of computational and graphical statistics, vol.9, issue.2, pp.249-265, 2000.
DOI : 10.1080/10618600.2000.10474879
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.115.3668

H. Ney, S. Martin, and F. Wessel, Statistical Language Modeling Using Leaving-One-Out, in Corpus-Based Statistical Methods in Speech and Language Processing, pp.174-207, 1997.

N. Palmer and P. W. Goldberg, PAC-learnability of probabilistic deterministic finite state automata in terms of variation distance, Proceedings of ALT'05, pp.157-170, 2005.

J. R. Partington, An Introduction to Hankel Operators, 1988.
DOI : 10.1017/CBO9780511623769

A. Paz, Introduction to probabilistic automata, 1971.

L. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, pp.257-286, 1989.

R. L. Rivest and R. E. Schapire, Inference of Finite Automata Using Homing Sequences, Information and Computation, vol.103, issue.2, pp.299-347, 1993.
DOI : 10.1006/inco.1993.1021

D. Ron, Y. Singer, and N. Tishby, Learning probabilistic automata with variable memory length, Proceedings of the seventh annual conference on Computational learning theory, COLT '94, pp.35-46, 1994.
DOI : 10.1145/180139.181006
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.160.2346

D. Ron, Y. Singer, and N. Tishby, On the learnability and usage of acyclic probabilistic finite automata, Proceedings of COLT 1995, pp.31-40, 1995.

Y. Sakakibara, Grammatical inference in bioinformatics, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.27, issue.7, pp.1051-1062, 2005.
DOI : 10.1109/TPAMI.2005.140

S. Arora and B. Barak, Computational Complexity: A Modern Approach, 2009.

L. Saul and F. Pereira, Aggregate and mixed-order Markov models for statistical language processing, Proceedings of the Second Conference on Empirical Methods in Natural Language Processing, pp.81-89, 1997.

C. R. Shalizi and J. P. Crutchfield, Computational mechanics: Pattern and prediction, structure and simplicity, Journal of Statistical Physics, vol.104, issue.3/4, pp.817-879, 2001.
DOI : 10.1023/A:1010388907793

C. Shibata and R. Yoshinaka, Marginalizing out transition probabilities for several subclasses of PFAs, Conference Proceedings, ICGI'12, pp.259-263, 2012.

A. Stolcke, Bayesian Learning of Probabilistic Language Models, 1994.

T. A. Sudkamp, Languages and Machines: an introduction to the theory of computer science, 2006.

F. Thollard, Improving probabilistic grammatical inference core algorithms with post-processing techniques, Proceedings of ICML'01, pp.561-568, 2001.

F. Thollard and P. Dupont, Entropie relative et algorithmes d'inférence grammaticale probabiliste, Actes de la conférence CAP'99, pp.115-122, 1999.

F. Thollard, P. Dupont, and C. de la Higuera, Probabilistic DFA inference using Kullback-Leibler divergence and minimality, Proceedings of ICML'00, pp.975-982, 2000.

S. Verwer, M. de Weerdt, and C. Witteveen, Learning driving behavior by timed syntactic pattern recognition, IJCAI'11, pp.1529-1534, 2011.

S. Verwer, M. de Weerdt, and C. Witteveen, A Likelihood-Ratio Test for Identifying Probabilistic Deterministic Real-Time Automata from Positive Data, Proceedings of ICGI'10, pp.203-216, 2010.
DOI : 10.1007/978-3-642-15488-1_17

E. Vidal, F. Thollard, C. de la Higuera, F. Casacuberta, and R. C. Carrasco, Probabilistic finite state automata - part I, IEEE Transactions on Pattern Analysis and Machine Intelligence, pp.1013-1025, 2005.
DOI : 10.1109/tpami.2005.147
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1.3277

N. Walkinshaw, B. Lambeau, C. Damas, K. Bogdanov, and P. Dupont, STAMINA: a competition to encourage the development and assessment of software model inference techniques, Empirical Software Engineering, vol.11, issue.2, pp.1-34, 2012.
DOI : 10.1007/s10664-012-9210-3

T. Y. Young, Handbook of Pattern Recognition and Image Processing: Computer Vision, 1994.

M. Young-Lai and F. W. Tompa, Stochastic grammatical inference of text database structure, Machine Learning, vol.40, issue.2, pp.111-137, 2000.
DOI : 10.1023/A:1007653929870

C. Zhai and J. Lafferty, A study of smoothing methods for language models applied to information retrieval, ACM Transactions on Information Systems, vol.22, issue.2, pp.179-214, 2004.
DOI : 10.1145/984321.984322