Adaptive early classification of temporal sequences using deep reinforcement learning
Abstract
In this article, we address the problem of early classification of temporal sequences with adaptive prediction times. We frame early prediction as a sequential decision-making problem and define a partially observable Markov decision process (POMDP) that captures the competing objectives of classification earliness and accuracy. We solve the POMDP by training an agent for early prediction with reinforcement learning. The agent learns to make adaptive decisions between classifying an incomplete sequence now or delaying its prediction to gather more data points. We adapt an existing algorithm for batch and online learning of the agent's action-value function with a deep neural network. We propose prioritized sampling, prioritized storing, and a specific episode initialization to address the imbalance in the agent's replay memory, which arises because (1) all but one of its actions terminate the episode, and thus (2) classification actions are rarer than delay actions. In experiments, we compare two definitions of the POMDP, based on delay reward shaping versus reward discounting. We demonstrate that a naive deep neural network trained to classify at fixed times achieves a worse accuracy-versus-earliness trade-off than the equivalent network trained with adaptive decision-making capabilities. Finally, we show improvements in accuracy induced by our adaptations to the existing algorithm used for online learning of the agent's action-value function.
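The adaptive decision loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the agent observes a growing prefix of the sequence and, at each step, either commits to one of the terminal "classify as class k" actions or takes the single non-terminal "delay" action to observe one more point. The constants (`N_CLASSES`, `MAX_LEN`) and the `q_values` stand-in for the trained deep Q-network are assumptions for the sake of the example.

```python
import numpy as np

rng = np.random.default_rng(0)
N_CLASSES = 3          # number of terminal "classify as k" actions (assumption)
DELAY = N_CLASSES      # index of the single non-terminal "delay" action
MAX_LEN = 20           # full sequence length (assumption)

def q_values(prefix):
    """Stand-in for the trained deep Q-network: maps the observed
    prefix of the sequence to one value per action. Here we simply
    make a class action look better as more of the sequence is seen."""
    seen = len(prefix) / MAX_LEN
    q = np.full(N_CLASSES + 1, 0.1)
    q[DELAY] = 1.0 - seen                  # delaying pays less as time passes
    q[int(prefix[0]) % N_CLASSES] = seen   # one class action grows attractive
    return q

def run_episode(sequence):
    """Greedy rollout: at each step the agent either commits to a class
    (terminal action, ending the episode) or delays for one more point."""
    for t in range(1, len(sequence) + 1):
        action = int(np.argmax(q_values(sequence[:t])))
        if action != DELAY:
            return action, t               # predicted class, decision time
    # Sequence exhausted: forced to classify with what was observed.
    return int(np.argmax(q_values(sequence)[:N_CLASSES])), len(sequence)

label, when = run_episode(rng.integers(0, N_CLASSES, MAX_LEN).astype(float))
```

Note that because every classification action ends the episode while delaying does not, a replay memory filled from such rollouts contains many delay transitions per classification transition; this is the imbalance that prioritized sampling and storing are meant to counteract.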
Origin: Files produced by the author(s)