G. Long, D. Franklin, S. Biswas, P. Ortiz, J. Oberg et al., Minimal Multi-threading: Finding and Removing Redundant Instructions in Multi-threaded Processors, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, pp.337-348, 2010.
DOI : 10.1109/MICRO.2010.41

M. J. Quinn, P. J. Hatcher, and K. C. Jourdenais, Compiling C* programs for a hypercube multicomputer, ACM SIGPLAN Notices, vol.23, issue.9, pp.57-65, 1988.
DOI : 10.1145/62116.62122

B. W. Coon and J. E. Lindholm, System and method for managing divergent threads in SIMD architecture, 2008.

C. Bienia, S. Kumar, J. P. Singh, and K. Li, The PARSEC benchmark suite, Proceedings of the 17th international conference on Parallel architectures and compilation techniques, PACT '08
DOI : 10.1145/1454115.1454128

F. Darema, D. A. George, V. A. Norton, and G. F. Pfister, A single-program-multiple-data computational model for EPEX/FORTRAN, Parallel Computing, vol.7, issue.1, pp.11-24, 1988.
DOI : 10.1016/0167-8191(88)90094-4

G. Diamos, A. Kerr, H. Wu, S. Yalamanchili, B. Ashbaugh et al., SIMD reconvergence at thread frontiers, MICRO, 2011.
DOI : 10.1145/2155620.2155676

Y. Lee, R. Avizienis, A. Bishara, R. Xia, D. Lockhart et al., Exploring the tradeoffs between programmability and efficiency in data-parallel accelerators, ISCA. ACM, pp.129-140, 2011.

D. M. Tullsen, S. J. Eggers, and H. M. Levy, Simultaneous multithreading, ACM SIGARCH Computer Architecture News, vol.23, issue.2, pp.392-403, 1995.
DOI : 10.1145/225830.224449

D. M. Tullsen, S. J. Eggers, J. S. Emer, H. M. Levy, J. L. Lo et al., Exploiting choice, ACM SIGARCH Computer Architecture News, vol.24, issue.2, pp.191-202, 1996.
DOI : 10.1145/232974.232993

J. González, Q. Cai, P. Chaparro, G. Magklis, R. Rakvic et al., Thread fusion, Proceeding of the thirteenth international symposium on Low power electronics and design, ISLPED '08, pp.363-368, 2008.
DOI : 10.1145/1393921.1394018

P. Barone, P. Bonizzoni, G. D. Vedova, and G. Mauri, An approximation algorithm for the shortest common supersequence problem, Proceedings of the 2001 ACM symposium on Applied computing, SAC '01, pp.56-60, 2001.
DOI : 10.1145/372202.372275

A. Borodin and R. El-Yaniv, Online computation and competitive analysis, 1998.

J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach, 2003.

J. Meng, D. Tarjan, and K. Skadron, Dynamic warp subdivision for integrated branch and memory divergence tolerance, ISCA

S. Ryoo, C. I. Rodrigues, S. S. Baghsorkhi, S. S. Stone, D. B. Kirk et al., Optimization principles and application performance evaluation of a multithreaded GPU using CUDA, Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming, PPoPP '08, pp.73-82, 2008.
DOI : 10.1145/1345206.1345220

Y. Yang, P. Xiang, J. Kong, and H. Zhou, A GPGPU compiler for memory optimization and parallelism management, PLDI. ACM, pp.86-97, 2010.

A. Lashgar and A. Baniasadi, Performance in GPU architectures: Potentials and distances, pp.75-81, 2011.

S. Collange, D. Defour, and Y. Zhang, Dynamic Detection of Uniform and Affine Vectors in GPGPU Computations, pp.46-55, 2009.
DOI : 10.1007/978-3-642-14122-5_8
URL : https://hal.archives-ouvertes.fr/hal-00396719