| HAL : hal-00624832, version 1 |
| ICONIP 2011, China (2011) |
|
|
|
|
| Q-Learning with Double Progressive Widening : Application to Robotics |
|
|
| Nataliya Sokolovska 1, 2; Olivier Teytaud 1, 2 |
|
|
| (19/09/2011) |
|
|
| Discretization of state and action spaces is a critical issue in $Q$-Learning. We propose to adapt the discretization in real time using the progressive widening technique, which has already been used in bandit-based methods. Results converge consistently to the optimum of the problem, without changing the parametrization for each new problem. |
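To make the idea of progressive widening concrete, here is a minimal Python sketch. It is not the authors' algorithm: it uses the generic rule from the bandit literature, where a set visited n times may hold at most ceil(c * n^alpha) discrete options, and it widens only the action set, whereas the abstract refers to both state and action spaces. The names `allowed_count`, `WideningActionSet` and `run_demo`, the constants `c` and `alpha`, and the toy reward are all assumptions made for illustration.

```python
import math
import random


def allowed_count(visits, c=1.0, alpha=0.5):
    """Size budget after `visits` visits: ceil(c * visits**alpha), at least 1."""
    return max(1, math.ceil(c * visits ** alpha))


class WideningActionSet:
    """Finite set of actions sampled from [low, high] that grows with visits."""

    def __init__(self, low, high, rng):
        self.low, self.high, self.rng = low, high, rng
        self.visits = 0
        self.actions = []  # sampled continuous actions
        self.q = []        # one Q-value estimate per sampled action

    def select(self, epsilon=0.1):
        """Widen the set if the budget allows, then pick epsilon-greedily."""
        self.visits += 1
        while len(self.actions) < allowed_count(self.visits):
            self.actions.append(self.rng.uniform(self.low, self.high))
            self.q.append(0.0)
        if self.rng.random() < epsilon:
            return self.rng.randrange(len(self.actions))
        return max(range(len(self.actions)), key=self.q.__getitem__)

    def update(self, index, target, lr=0.1):
        """Q-learning style update toward `target` (reward plus discounted next value)."""
        self.q[index] += lr * (target - self.q[index])


def run_demo(steps=2000, seed=0):
    """Hypothetical one-step problem: the reward is highest at action 0.3."""
    rng = random.Random(seed)
    action_set = WideningActionSet(0.0, 1.0, rng)
    for _ in range(steps):
        i = action_set.select()
        reward = -(action_set.actions[i] - 0.3) ** 2
        action_set.update(i, reward)  # one-step problem, so the target is the reward
    best = max(range(len(action_set.q)), key=action_set.q.__getitem__)
    return action_set.actions[best], len(action_set.actions)
```

Calling `run_demo()` returns the sampled action with the highest estimated value together with the number of actions the set has grown to. In a multi-state problem, one such set would be kept per discretized state, and the `target` passed to `update` would include the discounted maximum Q-value of the next state.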
|
| 1 : | Laboratoire de Recherche en Informatique (LRI) |
| CNRS : UMR8623 – Université Paris XI - Paris Sud | |
| 2 : | TAO (INRIA Saclay - Ile de France) |
| INRIA – CNRS : UMR8623 – Université Paris XI - Paris Sud | |
|
| Domain | : | Computer Science/Machine Learning |
|
|
| http://hal.archives-ouvertes.fr/hal-00624832 | |
| oai:hal.archives-ouvertes.fr:hal-00624832 | |
| Contributor: Nataliya Sokolovska | |
| Submitted on: Tuesday 20 September 2011, 03:23:55 | |
| Last modified on: Tuesday 20 September 2011, 10:23:19 | |