Akrour, R., Schoenauer, M., and Sebag, M., Preference-based policy learning, Machine Learning and Knowledge Discovery in Databases, pp. 12-27, 2011.

Akrour, R., Schoenauer, M., and Sebag, M., APRIL: Active preference learning-based reinforcement learning, Machine Learning and Knowledge Discovery in Databases, pp. 116-131, 2012.

Amann, B., Constantin, C., Caron, C., and Giroux, P., WebLab PROV: Computing fine-grained provenance links for XML artifacts, BIGProv'13 Workshop (in conjunction with EDBT/ICDT), pp. 298-306, 2013.

Azaria, A., Rabinovich, Z., Kraus, S., Goldman, C. V., and Gal, Y., Strategic advice provision in repeated human-agent interactions, p. 1500, 2012.

Brafman, R. I. and Tennenholtz, M., R-max: a general polynomial time algorithm for near-optimal reinforcement learning, The Journal of Machine Learning Research, vol. 3, pp. 213-231, 2003.

Bratko, I. and Šuc, D., Learning qualitative models, AI Magazine, vol. 24, no. 4, 2003.

Busa-Fekete, R., Szörényi, B., Weng, P., Cheng, W., and Hüllermeier, E., Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm, Machine Learning, vol. 97, no. 3, pp. 327-351, 2014.

Caron, C., Amann, B., Constantin, C., Giroux, P., and Santanchè, A., Provenance-based quality assessment and inference in data-centric workflow executions, On the Move to Meaningful Internet Systems: OTM 2014 Conferences, pp. 130-147, 2014.

Cohen, W. W. and Dalvi, B., Very Fast Similarity Queries on Semi-Structured Data from the Web, SDM, pp. 512-520, 2013. DOI: 10.1137/1.9781611972832.57

Cunningham, H., Maynard, D., Bontcheva, K., Tablan, V., Aswani, N., Roberts, I., Gorrell, G., et al., Developing Language Processing Components with GATE Version 8 (a User Guide), https://gate.ac.uk, 2014.

Dutkiewicz, J., Jędrzejek, C., Cybulka, J., and Falkowski, M., Knowledge-based highly-specialized terrorist event extraction, RuleML 2013 Challenge, Human Language Technology and Doctoral Consortium, p. 1, 2013.

Formiga, L., Barrón-Cedeño, A., Màrquez, L., Henríquez, C. A., and Mariño, J. B., Leveraging online user feedback to improve statistical machine translation, Journal of Artificial Intelligence Research, vol. 54, pp. 159-192, 2015.

Fromherz, M. P. J., Bobrow, D. G., and de Kleer, J., Model-based computing for design and control of reconfigurable systems, AI Magazine, vol. 24, no. 4, p. 120, 2003.

Fürnkranz, J., Hüllermeier, E., Cheng, W., and Park, S.-H., Preference-based reinforcement learning: a formal framework and a policy iteration algorithm, Machine Learning, pp. 123-156, 2012.

Gilbert, H., Zanuttini, B., Weng, P., Viappiani, P., and Nicart, E., Model-free reinforcement learning with skew-symmetric bilinear utilities, Uncertainty in Artificial Intelligence (UAI 2016), 2016.

K. A., S. K., and Encelle, B., Apprentissage de connaissances d'adaptation à partir des feedbacks des utilisateurs [Learning adaptation knowledge from user feedback], 25es Journées francophones d'Ingénierie des Connaissances (IC 2014), pp. 125-136, 2014.

Knox, W. B. and Stone, P., Framing reinforcement learning from human reward: Reward positivity, temporal discounting, episodicity, and performance, Artificial Intelligence, vol. 225, pp. 24-50, 2015.

Loftin, R., Peng, B., MacGlashan, J., Littman, M. L., Taylor, M. E., Huang, J., and Roberts, D. L., Learning behaviors via human-delivered discrete feedback: modeling implicit feedback strategies to speed up learning, Autonomous Agents and Multi-Agent Systems, vol. 30, no. 1, pp. 30-59, 2016.

Ogrodniczuk, M. and Przepiórkowski, A., Linguistic Processing Chains as Web Services: Initial Linguistic Considerations, Proceedings of the Workshop on Web Services and Processing Pipelines in HLT: Tool Evaluation, LR Production and Validation (WSPP 2010) at the Language Resources and Evaluation Conference, pp. 1-7, 2010.

Pandit, S., Gupta, S., et al., A comparative study on distance measuring approaches for clustering, International Journal of Research in Computer Science, vol. 2, no. 1, pp. 29-31, 2011.

Rao, K. and Whiteson, S., V-MAX: A General Polynomial Time Algorithm for Probably Approximately Correct Reinforcement Learning, 2011.

Rodrigues, F., Oliveira, N., and Barbosa, L. S., Towards an engine for coordination-based architectural reconfigurations, Computer Science and Information Systems, vol. 12, no. 2, pp. 607-634, 2015.

Sutton, R. S. and Barto, A. G., Reinforcement Learning: An Introduction, MIT Press, 1998.

Tversky, A. and Gati, I., Studies of similarity, Cognition and Categorization, vol. 1, pp. 79-98, 1978.

van Hage, W. R., Malaisé, V., Segers, R., Hollink, L., and Schreiber, G., Design and use of the Simple Event Model (SEM), Web Semantics: Science, Services and Agents on the World Wide Web, vol. 9, no. 2, pp. 128-136, 2011. DOI: 10.1016/j.websem.2011.03.003

Wilson, A., Fern, A., and Tadepalli, P., A Bayesian approach for policy learning from trajectory preference queries, Advances in Neural Information Processing Systems, pp. 1133-1141, 2012.

Wirth, C. and Fürnkranz, J., EPMC: Every visit preference Monte Carlo for reinforcement learning, Asian Conference on Machine Learning (ACML 2013), pp. 483-497, 2013.

Wirth, C. and Neumann, G., Model-free preference-based reinforcement learning, 2015.