Simulation of the Portals 4 protocol, and case study on the BXI interconnect

Julien Emmanuel 1, 2, Matthieu Moy 2, Ludovic Henrio 2, Grégoire Pichon 1
2 CASH - Compilation and Analysis, Software and Hardware
Inria Grenoble - Rhône-Alpes, LIP - Laboratoire de l'Informatique du Parallélisme
Abstract: We present a new network simulator that models the Portals 4 communication protocol used in High Performance Computing (HPC). It is built on top of SimGrid and uses cooperative actors to model the interactions between compute nodes in a supercomputer. Unlike most simulators in HPC, it models communications both on the interconnect and on the PCIe network inside each compute node, without resorting to a full emulation of the hardware. The simulator can be used to optimize or debug an application without access to an actual supercomputer. Leveraging SimGrid's flow model makes this possible and enables accurate simulation with good performance, even when the model runs on a laptop. We test the simulator with custom experiments as well as existing Portals code, and compare the results with Portals executions on an actual cluster using Atos' BXI interconnect.
Conference papers
Julien Emmanuel, Matthieu Moy, Ludovic Henrio, Grégoire Pichon. Simulation of the Portals 4 protocol, and case study on the BXI interconnect. HPCS 2020 - International Conference on High Performance Computing & Simulation, Dec 2020, Barcelona, Spain. pp.1-8. ⟨hal-02972297⟩



