
Adaptive early classification of temporal sequences using deep reinforcement learning

Abstract: In this article, we address the problem of early classification of temporal sequences with adaptive prediction times. We frame early prediction as a sequential decision-making problem and define a partially observable Markov decision process (POMDP) that captures the competing objectives of classification earliness and accuracy. We solve the POMDP by training an early-prediction agent with reinforcement learning. The agent learns to decide adaptively between classifying an incomplete sequence now and delaying its prediction to gather more data points. We adapt an existing algorithm for batch and online learning of the agent's action-value function with a deep neural network. We propose prioritized sampling, prioritized storing and a specific episode initialization to address the imbalance in the agent's memory, which arises because (1) all but one of its actions terminate the process and thus (2) classification actions are rarer than delay actions. In experiments, we compare two definitions of the POMDP, one based on delay reward shaping and one on reward discounting. We demonstrate that a naive deep neural network trained to classify at fixed times achieves a worse trade-off between accuracy and speed than the equivalent network trained with adaptive decision-making capabilities. Finally, we show improvements in accuracy induced by our specific adaptations to the existing algorithm used for online learning of the agent's action-value function.
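The decision process described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name, the `DELAY` encoding, the reward values, and the handling of a delay at the end of a sequence are all assumptions made for illustration. It shows the delay-reward-shaping variant, where each delay costs a small penalty and every classification action terminates the episode; the abstract's other variant relies on reward discounting instead.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

DELAY = -1  # hypothetical encoding of the single non-terminating action


@dataclass
class EarlyClassificationEpisode:
    """One episode over a single labelled sequence (illustrative only)."""

    sequence: List[float]
    label: int
    delay_reward: float = -0.01  # assumed penalty for each extra time step
    correct_reward: float = 1.0  # assumed reward for a correct class
    wrong_reward: float = -1.0   # assumed reward for a wrong class
    t: int = 1                   # number of points observed so far

    def observation(self) -> List[float]:
        # The agent sees only the prefix observed so far (partial observability).
        return self.sequence[: self.t]

    def step(self, action: int) -> Tuple[Optional[List[float]], float, bool]:
        """Apply an action and return (next observation, reward, done)."""
        if action == DELAY:
            if self.t < len(self.sequence):
                self.t += 1
                return self.observation(), self.delay_reward, False
            # Assumption: delaying past the last point ends the episode as a miss.
            return None, self.wrong_reward, True
        # Every classification action terminates the episode.
        reward = self.correct_reward if action == self.label else self.wrong_reward
        return None, reward, True


def run_episode(
    episode: EarlyClassificationEpisode,
    policy: Callable[[List[float]], int],
) -> Tuple[float, int]:
    """Roll out a policy; return (total reward, number of points observed)."""
    total, done = 0.0, False
    obs = episode.observation()
    while not done:
        obs, reward, done = episode.step(policy(obs))
        total += reward
    return total, episode.t


def wait_for_three(obs: List[float]) -> int:
    # Toy policy: delay until three points are seen, then predict class 1.
    return DELAY if len(obs) < 3 else 1


total_reward, points_used = run_episode(
    EarlyClassificationEpisode(sequence=[0.1, 0.2, 0.9, 0.8], label=1),
    wait_for_three,
)
```

In this toy setting, a learned agent would replace `wait_for_three` with a policy derived from its action-value function. Because only `DELAY` continues the episode, each episode contributes at most one classification transition, which is the memory imbalance the abstract mentions.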
Document type: Journal articles

Cited literature: 20 references
Contributor: Michèle Rombaut
Submitted on: Monday, December 9, 2019 - 6:07:36 PM
Last modification on: Wednesday, October 14, 2020 - 3:51:01 AM
Long-term archiving on: Tuesday, March 10, 2020 - 9:30:33 PM


Files produced by the author(s)



Coralie Martinez, Emmanuel Ramasso, Guillaume Perrin, Michèle Rombaut. Adaptive early classification of temporal sequences using deep reinforcement learning. Knowledge-Based Systems, Elsevier, 2019, ⟨10.1016/j.knosys.2019.105290⟩. ⟨hal-02401099⟩


