Abstract : Untimed models of large embedded systems, generally written using SystemC/TLM, allow the software team to start simulations before the RTL description is available, and then provide a golden reference model to the verification team. For those two purposes, only a correct functional behavior is required, but users are asking more and more for timing estimations early in the design flow. Because companies cannot afford to maintain two simulators for the same chip, only local modifications of the untimed model are considered. A known approach is to add timing annotations into the code and to reduce the number of costly context switches using temporal decoupling, meaning that a process can go ahead of the simulation time before synchronizing again. Our current goal is to apply temporal decoupling to the TLM platform of a many-core SoC dedicated to high performance computing. Part of this SoC communicates using classic memory-mapped buses, but it can be extended with hardware accelerators communicating using FIFOs. Whereas temporal decoupling for memory-based transactions has been widely studied, FIFO-based communications raise issues that have not been addressed before. In this paper, we provide an efficient solution to combine temporal decoupling and FIFO-based communications.