Tonmoy Dey, Kento Sato, Bogdan Nicolae, Jian Guo, Jens Domke, et al. Optimizing Asynchronous Multi-Level Checkpoint/Restart Configurations with Machine Learning.
IPDPSW'20: The 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, May 2020, New Orleans, United States. pp. 1036-1043,
⟨10.1109/IPDPSW50202.2020.00174⟩.
⟨hal-02914478⟩