Preference-based policy learning, Machine Learning and Knowledge Discovery in Databases, pp.12-27, 2011.
APRIL: Active preference learning-based reinforcement learning, Machine Learning and Knowledge Discovery in Databases, pp.116-131, 2012.
WebLab PROV: Computing fine-grained provenance links for XML artifacts, BIGProv'13 Workshop (in conjunction with EDBT/ICDT), pp.298-306, 2013.
Strategic advice provision in repeated human-agent interactions, p.1500, 2012.
R-max: A general polynomial time algorithm for near-optimal reinforcement learning, The Journal of Machine Learning Research, vol.3, pp.213-231, 2003.
Learning qualitative models, 2003.
Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm, Machine Learning, vol.97, pp.327-351, 2014.
Provenance-based quality assessment and inference in data-centric workflow executions, On the Move to Meaningful Internet Systems: OTM 2014 Conferences, pp.130-147, 2014.
Very Fast Similarity Queries on Semi-Structured Data from the Web, SDM, pp.512-520, 2013. DOI: 10.1137/1.9781611972832.57
Developing Language Processing Components with GATE Version 8 (a User Guide). https, 2014.
Knowledge-based highly-specialized terrorist event extraction, RuleML2013 Challenge, Human Language Technology and Doctoral Consortium, p.1, 2013.
Leveraging online user feedback to improve statistical machine translation, Journal of Artificial Intelligence Research, vol.54, pp.159-192, 2015.
Model-based computing for design and control of reconfigurable systems, AI Magazine, vol.24, issue.4, p.120, 2003.
Preference-based reinforcement learning: a formal framework and a policy iteration algorithm, Machine Learning, pp.123-156, 2012.
Model-free reinforcement learning with skew-symmetric bilinear utilities, Uncertainty in Artificial Intelligence (UAI), 2016.
Apprentissage de connaissances d'adaptation à partir des feedbacks des utilisateurs [Learning adaptation knowledge from user feedback], 25es Journées francophones d'Ingénierie des Connaissances, pp.125-136, 2014.
Framing reinforcement learning from human reward: Reward positivity, temporal discounting, episodicity, and performance, Artificial Intelligence, vol.225, pp.24-50, 2015.
Learning behaviors via human-delivered discrete feedback: modeling implicit feedback strategies to speed up learning, Autonomous Agents and Multi-Agent Systems, vol.30, issue.1, pp.30-59, 2016.
Linguistic Processing Chains as Web Services: Initial Linguistic Considerations, Proceedings of the Workshop on Web Services and Processing Pipelines in HLT: Tool Evaluation, LR Production and Validation (WSPP 2010) at the Language Resources and Evaluation Conference, pp.1-7, 2010.
A comparative study on distance measuring approaches for clustering, International Journal of Research in Computer Science, vol.2, issue.1, pp.29-31, 2011.
V-MAX: A General Polynomial Time Algorithm for Probably Approximately Correct Reinforcement Learning, 2011.
Towards an engine for coordination-based architectural reconfigurations, Computer Science and Information Systems, vol.12, issue.2, pp.607-634, 2015.
Reinforcement Learning, 1998. DOI: 10.1016/B978-012526430-3/50003-9
Studies of similarity, Cognition and Categorization, vol.1, pp.79-98, 1978.
Design and use of the Simple Event Model (SEM), Web Semantics: Science, Services and Agents on the World Wide Web, vol.9, issue.2, pp.128-136, 2011. DOI: 10.1016/j.websem.2011.03.003
A Bayesian approach for policy learning from trajectory preference queries, Advances in Neural Information Processing Systems, pp.1133-1141, 2012.
EPMC: Every visit preference monte carlo for reinforcement learning, Asian Conference on Machine Learning, ACML 2013, pp.483-497, 2013.
Model-free preference-based reinforcement learning, 2015.