Divergence Analysis with Affine Constraints

Diogo Sampaio 1, Rafael Martins 1, Sylvain Collange 2,*, Fernando Magno Quintão Pereira 1
* Corresponding author
1 Laboratório de Linguagens de Programação
DCC - UFMG - Departamento de Ciência da Computação [Minas Gerais]
2 ALF - Amdahl's Law is Forever
Inria Rennes – Bretagne Atlantique, IRISA-D3 - ARCHITECTURE
Abstract: The rising popularity of graphics processing units is bringing renewed interest in code optimization techniques for SIMD processors. Many of these optimizations rely on divergence analyses, which classify variables as uniform, if they have the same value on every thread, or divergent, if they might not. This paper introduces a new kind of divergence analysis that can represent variables as affine functions of thread identifiers. We have implemented this analysis in Ocelot, an open-source compiler, and used it to analyze a suite of 177 CUDA kernels from well-known benchmarks. We can mark about one fourth of all program variables as affine functions of thread identifiers. In addition to the novel divergence analysis, we also introduce the notion of a divergence-aware register allocator. This allocator uses information from our analysis either to rematerialize affine variables or to move uniform variables to shared memory. As evidence of its effectiveness, our divergence-aware allocator produces GPU code that is 29.70% faster than the code produced by Ocelot's register allocator. Divergence analysis with affine constraints has been publicly available in the Ocelot compiler since June 2012.
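The three-way classification described in the abstract (uniform, affine in the thread identifier, divergent) can be illustrated with a small sketch. This is a simplified model under stated assumptions, not the analysis implemented in Ocelot: each variable is modeled as `stride * tid + base`, only the stride is tracked, and all function and variable names (`add`, `scale`, `mul`, `load`, `classify`) are hypothetical.

```python
# Simplified sketch, not the paper's algorithm. Every name here is hypothetical.
# A variable v is modeled as v = stride * tid + base, where base is the same on
# all threads. Only the stride is tracked:
#   0        -> uniform
#   nonzero  -> affine in the thread identifier
#   None     -> divergent (no affine form known)
DIVERGENT = None


def add(sx, sy):
    """The sum of two affine values is affine; the strides add."""
    if sx is DIVERGENT or sy is DIVERGENT:
        return DIVERGENT
    return sx + sy


def scale(sx, k):
    """Multiplying by a compile-time constant k scales the stride."""
    return DIVERGENT if sx is DIVERGENT else k * sx


def mul(sx, sy):
    """The product of two uniform values is uniform; otherwise give up."""
    return 0 if sx == 0 and sy == 0 else DIVERGENT


def load(s_addr):
    """Assumption: a load from a uniform address yields a uniform value;
    any other load is treated as divergent."""
    return 0 if s_addr == 0 else DIVERGENT


def classify(stride):
    """Map a tracked stride to one of the three classes."""
    if stride is DIVERGENT:
        return "divergent"
    return "uniform" if stride == 0 else "affine"


# Abstract run over a small straight-line kernel body.
tid = 1                      # the thread identifier itself
n = 0                        # kernel parameter, same on every thread
i = add(scale(tid, 4), n)    # i = 4 * tid + n
x = load(i)                  # x = mem[i]
y = mul(n, n)                # y = n * n
z = add(x, y)                # z = x + y

result = {name: classify(s) for name, s in
          [("tid", tid), ("n", n), ("i", i), ("x", x), ("y", y), ("z", z)]}
print(result)
```

In this sketch `i` is classified as affine, `y` as uniform, and `x` and `z` as divergent. An affine variable such as `i` is the kind of value a divergence-aware allocator could recompute from the thread identifier instead of keeping it in a register, while a uniform variable such as `y` is a candidate for shared memory.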


Contributor: Sylvain Collange
Submitted on: Tuesday, November 20, 2012 - 6:44:58 PM
Last modification on: Wednesday, April 17, 2019 - 7:22:02 PM
Document(s) archived on: Thursday, February 21, 2013 - 12:32:07 PM





Diogo Sampaio, Rafael Martins, Sylvain Collange, Fernando Magno Quintão Pereira. Divergence Analysis with Affine Constraints. 24th International Symposium on Computer Architecture and High Performance Computing, Oct 2012, New York, NY, United States. pp.67-74, ⟨10.1109/SBAC-PAD.2012.22⟩. ⟨hal-00650235v2⟩


