
DITVA: Dynamic Inter-Thread Vectorization Architecture

Abstract: In the Single-Program Multiple-Data (SPMD) programming model, threads of an application exhibit very similar control flows and often execute the same instructions, but on different data. In this paper, we propose the Dynamic Inter-thread Vectorization Architecture (DITVA) to leverage the implicit Data Level Parallelism that exists across threads in SPMD applications. By assembling dynamic vector instructions at runtime, DITVA extends an in-order SMT processor with a dynamic inter-thread vector execution mode akin to the Single-Instruction, Multiple-Thread model of Graphics Processing Units. In this mode, multiple scalar threads running in lockstep share a single instruction stream and their respective instruction instances are aggregated into SIMD instructions. DITVA can leverage existing SIMD units and maintains binary compatibility with existing CPU architectures. To balance thread- and data-level parallelism, threads are statically grouped into fixed-size, independently scheduled warps. Additionally, to maximize dynamic vectorization opportunities, we adapt the fetch steering policy to favor thread synchronization within warps and thus improve lockstep execution. Our experimental evaluation of the DITVA architecture on the SPMD applications from the PARSEC and Rodinia OpenMP benchmarks shows that a 4-warp × 4-lane 4-issue DITVA architecture with a realistic bank-interleaved cache achieves 1.55× higher performance compared to a 4-thread 4-issue SMT architecture with AVX instructions, while fetching and issuing 51% fewer instructions, and achieving an overall 24% energy reduction. DITVA also enables applications limited by memory to scale with higher-bandwidth architectures. For instance, when the bandwidth is increased from 2GB/s to 16GB/s, we find that memory-bound applications show an improvement in performance by 3× in comparison with the baseline SMT. Therefore, DITVA appears to be a cost-effective design for achieving very high single-core performance on SPMD parallel sections.
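
The aggregation step described in the abstract can be illustrated with a small software sketch. This is not the hardware mechanism from the paper: the function name `group_lanes_by_pc`, the 4-lane warp, and the program-counter values are hypothetical and chosen only for illustration. It assumes that threads of one warp sitting at the same program counter can be issued together as one vector instruction with a per-lane activity mask.

```python
from collections import defaultdict


def group_lanes_by_pc(thread_pcs):
    """Group the threads of one warp by program counter (illustrative only).

    thread_pcs[lane] is the PC of the thread occupying that lane.
    Returns a dict mapping each distinct PC to a lane mask, i.e. a list of
    booleans marking which lanes would take part in the vector instruction.
    """
    masks = defaultdict(lambda: [False] * len(thread_pcs))
    for lane, pc in enumerate(thread_pcs):
        masks[pc][lane] = True
    return dict(masks)


# Hypothetical 4-lane warp: three threads in lockstep, one diverged.
warp = [0x400, 0x400, 0x408, 0x400]
masks = group_lanes_by_pc(warp)

for pc, mask in masks.items():
    print(hex(pc), mask)
```

In this toy case the four scalar instruction instances collapse into two issued instructions, one per distinct program counter. The more threads of a warp stay synchronized, the fewer instructions need to be fetched and issued, which is the effect the fetch steering policy in the abstract aims to encourage.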
Document type: Journal articles
Contributor: Caroline Collange
Submitted on: Tuesday, December 5, 2017 - 11:44:21 AM
Last modification on: Friday, July 10, 2020 - 4:15:32 PM





Sajith Kalathingal, Sylvain Collange, Bharath Swamy, André Seznec. DITVA: Dynamic Inter-Thread Vectorization Architecture. Journal of Parallel and Distributed Computing, Elsevier, 2018, pp.1-32. ⟨10.1016/j.jpdc.2017.11.006⟩. ⟨hal-01655904⟩


