Abstract: This paper presents a parallel execution model and a many-core processor design to run C programs in parallel. The model automatically builds parallel sections of machine instructions from the run trace. It parallelizes instruction fetch, renaming, execution and retirement. Predictor-based fetch is replaced by a fetch-decode-and-partly-execute stage able to compute most of the control instructions in order. Tomasulo's register renaming is extended to memory with a technique to match consumer/producer pairs. The Reorder Buffer is adapted to allow parallel retirement. The model is presented on a sum reduction example, which is also used to give a short analytical evaluation of the model's performance potential.
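To make the sum reduction example concrete, here is a minimal C sketch of the kind of program such a model could target. The abstract does not reproduce the paper's code, so the divide-and-conquer formulation, the function name `sum` and its signature are assumptions chosen for illustration only.

```c
#include <stddef.h>

/* Hypothetical divide-and-conquer sum reduction (not taken from the paper).
   The two recursive calls read disjoint halves of the array, so neither
   depends on a value produced by the other; only the final addition
   consumes both results. */
long sum(const long *t, size_t n)
{
    if (n == 0)
        return 0;          /* empty range contributes nothing */
    if (n == 1)
        return t[0];       /* single element: nothing to split */

    size_t half = n / 2;   /* split point between the two halves */
    return sum(t, half) + sum(t + half, n - half);
}
```

Calling `sum` on the eight values 1 through 8 returns 36. The recursion is written sequentially; the independence of the two halves is the property that a parallel execution could exploit.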
https://hal.archives-ouvertes.fr/hal-01152664 Contributor: David Parello Submitted on: Monday, May 18, 2015 - 1:42:39 PM Last modification on: Friday, October 22, 2021 - 3:07:35 PM Long-term archiving on: Thursday, April 20, 2017 - 1:51:56 AM
Bernard Goossens, David Parello, Katarzyna Porada, Djallal Rahmoune. Toward a Core Design to Distribute an Execution on a Many-Core Processor. PaCT: Parallel Computing Technologies, Aug 2015, Petrozavodsk, Russia. pp.390-404, ⟨10.1007/978-3-319-21909-7_38⟩. ⟨hal-01152664⟩