F. Agakov, E. Bonilla, J. Cavazos, B. Franke, G. Fursin et al., Using Machine Learning to Focus Iterative Optimization, International Symposium on Code Generation and Optimization (CGO'06), pp.295-305, 2006.
DOI : 10.1109/CGO.2006.37
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.112.2976

J. Ansel, C. Chan, Y. L. Wong, M. Olszewski, Q. Zhao et al., PetaBricks: a language and compiler for algorithmic choice, PLDI '09, pp.38-49, 2009.

A. W. Appel and M. Ginsburg, Modern Compiler Implementation in C, 2004.
DOI : 10.1017/CBO9781139174930

M. Arnold, S. J. Fink, D. Grove, M. Hind, and P. F. Sweeney, A Survey of Adaptive Optimization in Virtual Machines, Proceedings of the IEEE, vol.93, issue.2, pp.449-466, 2005.
DOI : 10.1109/JPROC.2004.840305

V. Aslot, M. J. Domeika, R. Eigenmann, G. Gaertner, W. B. Jones et al., SPEComp: A New Benchmark Suite for Measuring Parallel Computer Performance, WOMPAT '01, pp.1-10, 2001.
DOI : 10.1007/3-540-44587-0_1

H. Bae, L. Bachega, C. Dave, S. Lee, S. Lee et al., Cetus: A source-to-source compiler infrastructure for multicores, Proc. of the 14th Int'l Workshop on Compilers for Parallel Computing, 2009.

R. Baghdadi, A. Cohen, C. Bastoul, L. Pouchet, and L. Rauchwerger, The Potential of Synergistic Static, Dynamic and Speculative Loop Nest Optimizations for Automatic Parallelization, PESPMA 2010 - Workshop on Parallel Execution of Sequential Programs on Multi-core Architecture, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00494305

V. Bala, E. Duesterwald, and S. Banerjia, Dynamo: a transparent dynamic optimization system, Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation, PLDI '00, pp.1-12, 2000.

U. K. Banerjee, Dependence Analysis for Supercomputing, 1988.

M. M. Baskaran, A. Hartono, S. Tavarageri, T. Henretty, J. Ramanujam et al., Parameterized tiling revisited, Proceedings of the International Symposium on Code Generation and Optimization (CGO), pp.200-209, 2010.

C. Bastoul, A. Cohen, S. Girbal, S. Sharma, and O. Temam, Putting Polyhedral Loop Transformations to Work, LCPC'16 Intl. Workshop on Languages and Compilers for Parallel Computers, pp.209-225, 2003.
DOI : 10.1007/978-3-540-24644-2_14
URL : https://hal.archives-ouvertes.fr/inria-00071681

C. Bastoul, Code generation in the polyhedral model is easier than you think, Proceedings of the 13th International Conference on Parallel Architecture and Compilation Techniques (PACT 2004), pp.7-16, 2004.
DOI : 10.1109/PACT.2004.1342537
URL : https://hal.archives-ouvertes.fr/hal-00017260

C. Bastoul, Improving Data Locality in Static Control Programs, 2004.

M. Benabderrahmane, L. Pouchet, A. Cohen, and C. Bastoul, The Polyhedral Model Is More Widely Applicable Than You Think, ETAPS CC, 2010.
DOI : 10.1007/978-3-642-11970-5_16
URL : https://hal.archives-ouvertes.fr/inria-00551087

J. C. Beyler, Dynamic Software Optimization of Memory Accesses, 2007.

J. C. Beyler and P. Clauss, Performance driven data cache prefetching in a dynamic software optimization system, Proceedings of the 21st annual international conference on Supercomputing, ICS '07, pp.202-209, 2007.
DOI : 10.1145/1274971.1275000
URL : https://hal.archives-ouvertes.fr/inria-00504614

L. S. Blackford, J. Demmel, J. Dongarra, I. Duff, S. Hammarling et al., An updated set of basic linear algebra subprograms (BLAS), ACM Transactions on Mathematical Software, vol.28, issue.2, pp.135-151, 2001.
DOI : 10.1145/567806.567807

W. Blume and R. Eigenmann, The range test, Proceedings of the 1994 ACM/IEEE conference on Supercomputing, Supercomputing '94, pp.528-537, 1994.
DOI : 10.1145/602770.602858

R. D. Blumofe, C. F. Joerg, B. C. Kuszmaul, C. E. Leiserson, K. H. Randall et al., Cilk: An efficient multithreaded runtime system, Journal of Parallel and Distributed Computing, pp.207-216, 1995.
DOI : 10.1006/jpdc.1996.0107
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.3175

R. D. Blumofe and C. E. Leiserson, Scheduling multithreaded computations by work stealing, J. ACM, vol.46, issue.5, 1999.

F. Bodin, T. Kisuki, P. Knijnenburg, M. O'Boyle, and E. Rohou, Iterative compilation in a non-linear optimisation space, Workshop on Profile and Feedback-Directed Compilation, 1998.
URL : https://hal.archives-ouvertes.fr/inria-00475919

U. Bondhugula, M. Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev et al., Automatic Transformations for Communication-Minimized Parallelization and Locality Optimization in the Polyhedral Model, Proceedings of the Joint European Conferences on Theory and Practice of Software, 17th international conference on Compiler construction, CC'08/ETAPS'08, pp.132-146, 2008.
DOI : 10.1007/978-3-540-78791-4_9

U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan, A practical automatic polyhedral parallelizer and locality optimizer, PLDI '08, pp.101-113, 2008.

D. Bruening, S. Devabhaktuni, and S. Amarasinghe, Softspec: Software-based speculative parallelism, ACM Workshop on Feedback-Directed and Dynamic Optimization, 2000.

D. L. Bruening, Efficient, transparent, and comprehensive runtime code manipulation, 2004.

P. Calland, A. Darte, Y. Robert, and F. Vivien, On the removal of anti and output dependences, Proceedings of International Conference on Application Specific Systems, Architectures and Processors: ASAP '96, pp.285-312, 1998.
DOI : 10.1109/ASAP.1996.542829
URL : https://hal.archives-ouvertes.fr/inria-00073890

C. Chen, J. Chame, and M. Hall, CHiLL: A framework for composing high-level loop transformations, 2008.

D. Chen, J. Torrellas, and P. Yew, An efficient algorithm for the run-time parallelization of DOACROSS loops, Proceedings of the 1994 ACM/IEEE conference on Supercomputing, Supercomputing '94, pp.518-527, 1994.
DOI : 10.1145/602770.602857

P. Chen, M. Hung, Y. Hwang, R. Ju, and J. Lee, Compiler support for speculative multithreading architecture with probabilistic points-to analysis, Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP '03, pp.25-36, 2003.

M. Cintra and D. R. Llanos, Toward efficient and robust software speculative parallelization on multiprocessors, Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP '03, pp.13-24, 2003.
DOI : 10.1145/781498.781501

P. Clauss and I. Tchoupaeva, A Symbolic Approach to Bernstein Expansion for Program Analysis and Optimization, CC, 2004.

P. Clauss, Counting solutions to linear and nonlinear constraints through ehrhart polynomials: applications to analyze and transform scientific programs, Proceedings of the 10th international conference on Supercomputing, ICS '96, pp.278-285, 1996.
URL : https://hal.archives-ouvertes.fr/hal-01100306

P. Clauss, F. J. Fernández, D. Garbervetsky, and S. Verdoolaege, Symbolic Polynomial Maximization Over Convex Sets and Its Application to Memory Requirement Estimation, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol.17, issue.8, 2009.
DOI : 10.1109/TVLSI.2008.2002049
URL : https://hal.archives-ouvertes.fr/inria-00504617

A. Cohen, S. Girbal, and O. Temam, A Polyhedral Approach to Ease the Composition of Program Transformations, Euro-Par'04, no. 3149 in LNCS, pp.292-303, 2004.
DOI : 10.1007/978-3-540-27866-5_38
URL : https://hal.archives-ouvertes.fr/hal-01257301

R. S. Cohn, D. W. Goodwin, and P. G. Lowney, Optimizing Alpha executables on Windows NT with Spike, Digital Tech. J, vol.9, pp.3-20, 1998.

J. Collard, Automatic parallelization of while-loops using speculative execution, International Journal of Parallel Programming, vol.634, issue.1, pp.191-219, 1995.
DOI : 10.1007/BF02577789

J. Collard, D. Barthou, and P. Feautrier, Fuzzy array dataflow analysis, ACM SIGPLAN Notices, vol.30, issue.8, pp.92-101, 1995.
DOI : 10.1145/209937.209947
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.6305

K. D. Cooper, D. Subramanian, and L. Torczon, Adaptive optimizing compilers for the 21st century, Journal of Supercomputing, vol.23, 2001.

R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck, Efficiently computing static single assignment form and the control dependence graph, ACM Transactions on Programming Languages and Systems, vol.13, issue.4, 1991.
DOI : 10.1145/115372.115320

M. Devuyst, D. M. Tullsen, and S. Kim, Runtime parallelization of legacy code on a transactional memory system, Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers, HiPEAC '11, pp.127-136, 2011.
DOI : 10.1145/1944862.1944882

C. Ding and K. Kennedy, Improving cache performance in dynamic applications through data and computation reorganization at run time, Proceedings of the SIGPLAN '99 Conference on Programming Language Design and Implementation, pp.229-241, 1999.

L. Djoudi, D. Barthou, P. Carribault, C. Lemuet, J. Acquaviva et al., MAQAO: Modular assembler quality analyzer and optimizer for Itanium 2, Workshop on Explicitly Parallel Instruction Computing Techniques, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00141075

K. Ebcioğlu and E. R. Altman, DAISY, ACM SIGARCH Computer Architecture News, vol.25, issue.2, pp.26-37, 1997.
DOI : 10.1145/384286.264126

A. Edwards, H. Vo, and A. Srivastava, Vulcan binary transformation in a distributed environment, 2001.

P. Feautrier, Parametric integer programming, RAIRO - Operations Research, vol.22, issue.3, pp.243-268, 1988.
DOI : 10.1051/ro/1988220302431

P. Feautrier, Dataflow analysis of array and scalar references, International Journal of Parallel Programming, vol.24, issue.4, pp.23-53, 1991.
DOI : 10.1007/BF01407931

P. Feautrier, Array expansion, ACM Int. Conf. on Supercomputing, pp.429-441, 1988.
DOI : 10.1145/2591635.2667159
URL : https://hal.archives-ouvertes.fr/hal-01099746

P. Feautrier, Some efficient solutions to the affine scheduling problem. I. One-dimensional time, International Journal of Parallel Programming, vol.40, issue.6, pp.313-348, 1992.
DOI : 10.1007/BF01407835

P. Feautrier, Some efficient solutions to the affine scheduling problem. Part II. Multidimensional time, International Journal of Parallel Programming, vol.2, issue.4, 1992.
DOI : 10.1007/BF01379404

P. Feautrier, Automatic parallelization in the polytope model, Laboratoire PRiSM, Université de Versailles St-Quentin en Yvelines, 45, avenue des États-Unis, F-78035 Versailles Cedex, pp.79-103, 1996.

G. Fursin, A. Cohen, M. O'Boyle, and O. Temam, A Practical Method for Quickly Evaluating Program Optimizations, Proceedings of the International Conference on High Performance Embedded Architectures & Compilers, pp.29-46, 2005.
DOI : 10.1007/11587514_4
URL : https://hal.archives-ouvertes.fr/inria-00001054

G. Fursin, Y. Kashnikov, A. Memon, Z. Chamski, O. Temam et al., Milepost GCC: Machine Learning Enabled Self-tuning Compiler, International Journal of Parallel Programming, vol.16, issue.2-3, pp.296-327, 2011.
DOI : 10.1007/s10766-010-0161-2
URL : https://hal.archives-ouvertes.fr/hal-00685276

J. Gosling, B. Joy, and G. L. Steele, The Java Language Specification, 1996.

G. Goumas, M. Athanasaki, and N. Koziris, An efficient code generation technique for tiled iteration spaces, IEEE Transactions on Parallel and Distributed Systems, vol.14, issue.10, p.1034, 2003.
DOI : 10.1109/TPDS.2003.1239870

T. Grosser, Enabling Polyhedral Optimizations in LLVM, 2011.

A. Größlinger, M. Griebl, and C. Lengauer, Introducing nonlinear parameters to the polyhedron model, Proc. 11th Workshop on Compilers for Parallel Computers Research Report Series, pp.1-12, 2004.

G. Gupta and S. Rajopadhye, The Z-polyhedral model, Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP '07, pp.237-248, 2007.
DOI : 10.1145/1229428.1229478

A. Hartono, M. M. Baskaran, J. Ramanujam, and P. Sadayappan, DynTile: Parametric tiled loop generation for parallel execution on multicore processors, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pp.1-12, 2010.
DOI : 10.1109/IPDPS.2010.5470459

A. Hartono, M. M. Baskaran, C. Bastoul, A. Cohen, S. Krishnamoorthy et al., Parametric multi-level tiling of imperfectly nested loops, Proceedings of the 23rd international conference on Supercomputing, ICS '09, pp.147-157, 2009.
DOI : 10.1145/1542275.1542301
URL : https://hal.archives-ouvertes.fr/hal-00645328

B. Hertzberg and K. Olukotun, Runtime automatic speculative parallelization, IEEE/ACM International Symposium on Code Generation and Optimization, pp.64-73, 2011.
DOI : 10.1109/cgo.2011.5764675

M. Hind, Pointer analysis, Proceedings of the 2001 ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering, PASTE '01, pp.54-61, 2001.
DOI : 10.1145/379605.379665

J. Hollingsworth, B. P. Miller, and J. Cargille, Dynamic program instrumentation for scalable performance tools, Proceedings of IEEE Scalable High Performance Computing Conference, pp.841-850, 1994.
DOI : 10.1109/SHPCC.1994.296728

T. A. Johnson, R. Eigenmann, and T. N. Vijaykumar, Speculative thread decomposition through empirical optimization, Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP '07, pp.205-214, 2007.
DOI : 10.1145/1229428.1229474

R. M. Karp, R. E. Miller, and S. Winograd, The Organization of Computations for Uniform Recurrence Equations, Journal of the ACM, vol.14, issue.3, pp.563-590, 1967.
DOI : 10.1145/321406.321418

W. Kelly, W. Pugh, and E. Rosser, Code generation for multiple mappings, Proceedings Frontiers '95. The Fifth Symposium on the Frontiers of Massively Parallel Computation, pp.332-341, 1994.
DOI : 10.1109/FMPC.1995.380437
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.23.8696

A. Ketterlin and P. Clauss, Recovering the Memory Behavior of Executable Programs, 2010 10th IEEE Working Conference on Source Code Analysis and Manipulation, 2010.
DOI : 10.1109/SCAM.2010.18
URL : https://hal.archives-ouvertes.fr/inria-00502813

D. Kim and S. V. Rajopadhye, Parameterized tiling for imperfectly nested loops, 2009.

D. Kim, L. Renganarayanan, D. Rostron, S. Rajopadhye, and M. M. Strout, Multi-level tiling, Proceedings of the 2007 ACM/IEEE conference on Supercomputing, SC '07, pp.1-51, 2007.
DOI : 10.1145/1362622.1362691

M. Kim, H. Kim, and C. Luk, SD3: A Scalable Approach to Dynamic Data-Dependence Profiling, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, pp.535-546, 2010.
DOI : 10.1109/MICRO.2010.49

T. Kistler and M. Franz, Continuous program optimization: Design and evaluation, IEEE Transactions on Computers, vol.50, issue.6, pp.549-566, 2001.
DOI : 10.1109/12.931893

T. Kisuki, P. M. Knijnenburg, M. F. O'Boyle, and H. A. Wijshoff, Iterative compilation in program optimization, Computing Systems, 2000.

X. Kong, D. Klappholz, and K. Psarris, The I test: an improved dependence test for automatic parallelization and vectorization, IEEE Transactions on Parallel and Distributed Systems, vol.2, issue.3, 1991.
DOI : 10.1109/71.86109

A. Kotha, K. Anand, M. Smithson, G. Yellareddy, and R. Barua, Automatic Parallelization in a Binary Rewriter, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, p.43, 2010.
DOI : 10.1109/MICRO.2010.27

M. A. Laurenzano, M. M. Tikir, L. Carrington, and A. Snavely, PEBIL: Efficient static binary instrumentation for Linux, 2010 IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS), 2010.
DOI : 10.1109/ISPASS.2010.5452024
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.170.2621

C. Lengauer and M. Griebl, On the parallelization of loop nests containing while loops, Proceedings the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis, pp.10-18, 1995.
DOI : 10.1109/AISPAS.1995.401360

S. Leung and J. Zahorjan, Improving the performance of runtime parallelization, Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming, PPOPP '93, pp.83-91, 1993.

A. W. Lim and M. S. Lam, Maximizing parallelism and minimizing synchronization with affine partitions, Parallel Computing, pp.201-214, 1998.
DOI : 10.1016/S0167-8191(98)00021-0
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.24.7731

W. Liu, J. Tuck, L. Ceze, W. Ahn, K. Strauss et al., POSH, Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP '06, pp.158-167, 2006.
DOI : 10.1145/1122971.1122997

S. Long and G. Fursin, A heuristic search algorithm based on unified transformation framework, ICPPW '05: Proceedings of the 2005 International Conference on Parallel Processing Workshops (ICPPW'05), pp.137-144, 2005.

J. Lu, H. Chen, P. Yew, and W. Hsu, Design and implementation of a lightweight dynamic optimization system, Journal of Instruction-Level Parallelism, vol.6, 2004.

C. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser et al., Pin: Building customized program analysis tools with dynamic instrumentation, Programming Language Design and Implementation, pp.190-200, 2005.

J. Mars and R. Hundt, Scenario Based Optimization: A Framework for Statically Enabling Online Optimizations, 2009 International Symposium on Code Generation and Optimization, pp.169-179, 2009.
DOI : 10.1109/CGO.2009.24

E. Meijer and J. Gough, Technical overview of the common language runtime, 2000.

J. Mellor-Crummey, R. J. Fowler, G. Marin, and N. Tallent, HPCView: A tool for top-down analysis of node performance, The Journal of Supercomputing, vol.23, issue.1, pp.81-104, 2002.
DOI : 10.1023/A:1015789220266

M. Philippsen, N. Tillmann, and D. Brinkers, Double Inspection for Run-Time Loop Parallelization, Proceedings of the 24th International Workshop on Languages and Compilers for Parallel Computing, 2011.
DOI : 10.1007/978-3-642-36036-7_4

B. P. Miller, M. D. Callaghan, J. M. Cargille, J. K. Hollingsworth, R. B. Irvin et al., The Paradyn parallel performance measurement tools, IEEE Computer, vol.28, pp.37-46, 1995.

R. Mirchandaney, J. H. Saltz, and D. Baxter, Run-time parallelization and scheduling of loops, IEEE Transactions on Computers, vol.40, 1991.

N. Mitchell, L. Carter, and J. Ferrante, Localizing non-affine array references, 1999 International Conference on Parallel Architectures and Compilation Techniques, 1999.
DOI : 10.1109/PACT.1999.807526
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.44.1925

S. S. Muchnick, Advanced Compiler Design and Implementation, 1997.

R. Muth, S. Debray, S. Watterson, and K. De Bosschere, alto: A link-time optimizer for the Compaq Alpha, Software: Practice and Experience, pp.67-101, 1999.

N. Nethercote and J. Seward, Valgrind: a framework for heavyweight dynamic binary instrumentation, Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation, PLDI '07, pp.89-100, 2007.

A. Nisbet, GAPS: A compiler framework for genetic algorithm (GA) optimised parallelisation, HPCN Europe, pp.987-989, 1998.
DOI : 10.1007/BFb0037253

K. Ootsu, T. Yokota, T. Ono, and T. Baba, Preliminary evaluation of a binary translation system for multithreaded processors, International Workshop on Innovative Architecture for Future Generation High-Performance Processors and Systems, pp.77-84, 2002.
DOI : 10.1109/IWIA.2002.1035021

E. Park, L. Pouchet, J. Cavazos, A. Cohen, and P. Sadayappan, Predictive modeling in a polyhedral optimization space, 9th IEEE/ACM International Symposium on Code Generation and Optimization (CGO'11), 2011.
URL : https://hal.archives-ouvertes.fr/inria-00551076

P. M. Petersen and D. A. Padua, Static and dynamic evaluation of data dependence analysis, Proceedings of the 7th international conference on Supercomputing, ICS '93, pp.107-116, 1993.

R. Ponnusamy, J. Saltz, and A. Choudhary, Runtime compilation techniques for data partitioning and communication schedule reuse, Proceedings of the 1993 ACM/IEEE conference on Supercomputing, Supercomputing '93, pp.361-370, 1993.
DOI : 10.1145/169627.169752
URL : http://bmi.osu.edu/resources/publications/105.pdf

L. Pouchet, Iterative Optimization in the Polyhedral Model, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00419974

L. Pouchet, C. Bastoul, A. Cohen, and J. Cavazos, Iterative optimization in the polyhedral model: Part II, multidimensional time, PLDI'08, pp.90-100, 2008.
URL : https://hal.archives-ouvertes.fr/hal-01257273

L. Pouchet, C. Bastoul, A. Cohen, and N. Vasilache, Iterative Optimization in the Polyhedral Model: Part I, One-Dimensional Time, International Symposium on Code Generation and Optimization (CGO'07), pp.144-156, 2007.
DOI : 10.1109/CGO.2007.21
URL : https://hal.archives-ouvertes.fr/hal-01257281

L. Pouchet, U. Bondhugula, C. Bastoul, A. Cohen, J. Ramanujam et al., Combined Iterative and Model-driven Optimization in an Automatic Parallelization Framework, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, 2010.
DOI : 10.1109/SC.2010.14
URL : https://hal.archives-ouvertes.fr/inria-00551067

W. Pugh, The Omega test: a fast and practical integer programming algorithm for dependence analysis, Proceedings of the 1991 ACM/IEEE conference on Supercomputing, Supercomputing '91, pp.4-13, 1991.
DOI : 10.1145/125826.125848

C. García-Quiñones, C. Madriles, J. Sánchez, P. Marcuello, A. González et al., Mitosis compiler: an infrastructure for speculative threading based on pre-computation slices, Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation, PLDI '05, pp.269-279, 2005.

F. Quillere, S. Rajopadhye, and D. Wilde, Generation of efficient nested loops from polyhedra, International Journal of Parallel Programming, vol.28, 2000.

E. Raman, N. Vachharajani, R. Rangan, and D. I. August, Spice, Proceedings of the sixth annual IEEE/ACM international symposium on Code generation and optimization, CGO '08, pp.175-184, 2008.
DOI : 10.1145/1356058.1356082

J. Ramanujam and P. Sadayappan, Tiling multidimensional iteration spaces for multicomputers, Journal of Parallel and Distributed Computing, vol.16, issue.2, pp.108-120, 1992.
DOI : 10.1016/0743-7315(92)90027-K

L. Rauchwerger, N. M. Amato, and D. A. Padua, Run-time methods for parallelizing partially parallel loops, Proceedings of the 9th international conference on Supercomputing, ICS '95, pp.137-146, 1995.
DOI : 10.1145/224538.224553

L. Rauchwerger, N. M. Amato, and D. A. Padua, A scalable method for run-time loop parallelization, International Journal of Parallel Programming, vol.4, issue.1, pp.26-32, 1995.
DOI : 10.1007/BF02577866

L. Rauchwerger and D. Padua, The privatizing DOALL test, Proceedings of the 8th international conference on Supercomputing, ICS '94, pp.33-43, 1994.
DOI : 10.1145/181181.181254

L. Rauchwerger and D. Padua, The LRPD test: speculative runtime parallelization of loops with privatization and reduction parallelization, Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation, PLDI '95, pp.218-232, 1995.

L. Renganarayanan, D. Kim, S. Rajopadhye, and M. M. Strout, Parameterized tiled loops for free, Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation, PLDI '07, pp.405-414, 2007.
DOI : 10.1145/1273442.1250780
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.125.4572

L. Renganarayanan and S. V. Rajopadhye, Positivity, posynomials and tile size selection, SC'08, 2008.

T. Romer, G. Voelker, D. Lee, A. Wolman, W. Wong et al., Instrumentation and optimization of Win32/Intel executables using Etch, Proceedings of the USENIX Windows NT Workshop, pp.1-7, 1997.

S. Rus, M. Pennings, and L. Rauchwerger, Sensitivity analysis for automatic parallelization on multi-cores, Proceedings of the 21st annual international conference on Supercomputing, ICS '07, pp.263-273, 2007.
DOI : 10.1145/1274971.1275008

S. Rus, L. Rauchwerger, and J. Hoeflinger, Hybrid analysis, Proceedings of the 16th international conference on Supercomputing, ICS '02, pp.251-283, 2003.
DOI : 10.1145/514191.514229

J. H. Saltz and R. Mirchandaney, The preprocessed doacross loop, International Conference on Parallel Processing, pp.174-179, 1991.

J. H. Saltz, R. Mirchandaney, and K. Crowley, The doconsider loop, Proceedings of the 3rd international conference on Supercomputing, ICS '89, pp.29-40, 1989.
DOI : 10.1145/318789.318794

A. Schrijver, Theory of linear and integer programming, 1986.

B. Schwarz, S. Debray, G. Andrews, and M. Legendre, PLTO: A link-time optimizer for the Intel IA-32 architecture, Proc. 2001 Workshop on Binary Translation (WBT-2001), 2001.

A. Srivastava and A. Eustace, Atom: A system for building customized program analysis tools, pp.196-205, 1994.

M. M. Strout, L. Carter, and J. Ferrante, Compile-time composition of run-time data and iteration reorderings, Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation, PLDI '03, pp.91-102, 2003.

H. Sutter, The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software, Dr. Dobb's Journal, vol.30, issue.3, 2005.

N. Thomas, G. Tanase, O. Tkachyshyn, J. Perdue, N. M. Amato et al., A framework for adaptive algorithm selection in STAPL, Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP '05, pp.277-288, 2005.
DOI : 10.1145/1065944.1065981

C. Tian, M. Feng, and R. Gupta, Speculative parallelization using state separation and multiple value prediction, Proceedings of the 2010 international symposium on Memory management, ISMM '10, pp.63-72, 2010.
DOI : 10.1145/1806651.1806663
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.193.3128

C. Tian, M. Feng, V. Nagarajan, and R. Gupta, Copy or discard execution model for speculative parallelization on multicores, Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture, MICRO 41, pp.330-341, 2008.

K. Tian, Y. Jiang, E. Z. Zhang, and X. Shen, An input-centric paradigm for program dynamic optimizations, OOPSLA '10, pp.125-139, 2010.

S. Triantafyllis, M. Vachharajani, N. Vachharajani, and D. I. August, Compiler optimization-space exploration, International Symposium on Code Generation and Optimization (CGO 2003), pp.204-215, 2003.
DOI : 10.1109/CGO.2003.1191546
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.131.1622

K. Trifunovic, A. Cohen, D. Edelsohn, F. Li, T. Grosser et al., GRAPHITE Two Years After: First Lessons Learned From Real-World Polyhedral Compilation, GCC Research Opportunities Workshop (GROW'10), 2010.
URL : https://hal.archives-ouvertes.fr/inria-00551516

L. Van Put, D. Chanet, B. De Bus, B. De Sutter, and K. De Bosschere, DIABLO: a reliable, retargetable and extensible link-time rewriting framework, Proceedings of the Fifth IEEE International Symposium on Signal Processing and Information Technology, 2005.
DOI : 10.1109/ISSPIT.2005.1577061

N. Vasilache, A. Cohen, and L. Pouchet, Automatic Correction of Loop Transformations, 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007), pp.292-304, 2007.
DOI : 10.1109/PACT.2007.4336220
URL : https://hal.archives-ouvertes.fr/hal-01257283

S. Verdoolaege, isl: An Integer Set Library for the Polyhedral Model, Proceedings of the Third International Congress on Mathematical Software, ICMS'10, pp.299-302, 2010.
DOI : 10.1007/978-3-642-15582-6_49

S. Verdoolaege, R. Seghir, K. Beyls, V. Loechner, and M. Bruynooghe, Counting Integer Points in Parametric Polytopes Using Barvinok's Rational Functions, Algorithmica, vol.48, issue.1, pp.37-66, 2007.
DOI : 10.1007/s00453-006-1231-0

M. J. Voss and R. Eigenmann, High-level adaptive program optimization with ADAPT, PPoPP '01, pp.93-102, 2001.

R. C. Whaley, A. Petitet, and J. Dongarra, Automated empirical optimizations of software and the ATLAS project, Parallel Computing, vol.27, issue.1-2, pp.3-35, 2001.

M. E. Wolf and M. S. Lam, A data locality optimizing algorithm, PLDI '91: Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation, pp.30-44, 1991.

M. Wolfe, Optimizing supercompilers for supercomputers, 1982.

J. Yang, K. Skadron, M. Soffa, and K. Whitehouse, Feasibility of dynamic binary parallelization, 2011.

E. Yardımcı and M. Franz, Dynamic parallelization and mapping of binary executables on hierarchical platforms, Proceedings of the 3rd conference on Computing frontiers, CF '06, pp.127-138, 2006.

H. Zhong, M. Mehrara, S. A. Lieberman, and S. A. Mahlke, Uncovering hidden loop level parallelism in sequential applications, HPCA, pp.290-301, 2008.

C. Zhu and P. Yew, A scheme to enforce data dependence on large multiprocessor systems, IEEE Trans. Softw. Eng, vol.13, pp.726-739, 1987.