Conference papers

Measuring predictability of Nvidia’s GPU warp and block schedulers: Application to the summation problem

David Defour 1
1 DALI - Digits, Architectures et Logiciels Informatiques
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, UPVD - Université de Perpignan Via Domitia
Abstract : GPUs are massively multicore architectures that manage several thousand concurrent threads. This concurrency, maintained by several schedulers, is necessary to sustain high performance but negatively impacts predictability. This lack of predictability is not a problem for most data-parallel applications written in CUDA and has therefore not been widely studied. For some applications, however, such as the summation of floating-point numbers, it can be problematic, as it can lead to a deadlock situation. In this work, we first propose measures of predictability, together with CUDA tests to estimate these measures for the warp and block schedulers on architectures from G80 to GK104. We then evaluate how these measures can be influenced, apply the results to the atomic addition of floating-point numbers, and show how to make this operation predictable.
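To illustrate why scheduling order matters for floating-point summation, here is a minimal Python sketch. It is not taken from the paper and does not reproduce its CUDA tests. The helper `sum_in_order` and the sample values are hypothetical; the two index lists stand in for two different orders in which a scheduler might let threads contribute to an atomic accumulator.

```python
def sum_in_order(values, order):
    """Accumulate values[i] in the given order, as a sequence of atomic adds would."""
    total = 0.0
    for i in order:
        total += values[i]
    return total

# Hypothetical inputs: two large terms that cancel, plus two small terms.
values = [1.0e17, 1.0, -1.0e17, 1.0]

# Order A: the first small term is added to 1.0e17 and lost to rounding.
sum_a = sum_in_order(values, [0, 1, 2, 3])

# Order B: the large terms cancel first, so both small terms survive.
sum_b = sum_in_order(values, [0, 2, 1, 3])

print(sum_a, sum_b)  # the two orders give different results
```

Because floating-point addition is not associative, the same inputs yield different sums depending only on the order of accumulation, which is why an unpredictable scheduler makes the result of an atomic floating-point sum unpredictable.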

https://hal.archives-ouvertes.fr/hal-01267747
Contributor : David Defour
Submitted on : Thursday, February 4, 2016 - 8:13:08 PM
Last modification on : Wednesday, June 26, 2019 - 5:13:26 PM

Citation

David Defour. Measuring predictability of Nvidia’s GPU warp and block schedulers: Application to the summation problem. MCSoC: Embedded Multicore/Many-core Systems-on-Chip, Sep 2015, Turin, Italy. pp.17-24, ⟨10.1109/MCSoC.2015.9⟩. ⟨hal-01267747⟩
