Journal articles

Towards ultra-scale Branch-and-Bound using a high-productivity language

Tiago Carneiro 1, *, Jan Gmys 1, Nouredine Melab 1, Daniel Tuyttens 2
* Corresponding author
1 BONUS - Optimisation de grande taille et calcul large échelle
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189
Abstract: Due to the highly irregular nature and prohibitive execution times of Branch-and-Bound (B&B) algorithms applied to combinatorial optimization problems (COPs), their parallelization has received great attention over the last two decades. Indeed, significant efforts have been made to revisit the parallelization of B&B following the rapid evolution of high-performance computing technologies and to address the associated scientific and technical challenges. However, these parallelization efforts have always been guided by performance, setting aside programming productivity. Yet productivity is crucial for designing ultra-scale algorithms able to harness modern supercomputers, which are increasingly complex and include millions of processing cores and heterogeneous building-block devices. In this paper, we investigate the partitioned global address space (PGAS)-based approach using Chapel for the productivity-aware design and implementation of distributed B&B for solving large COPs. The proposed algorithms are extensively evaluated using the Flow-shop scheduling problem as a test case. The Chapel-based implementation is compared to its traditionally used MPI+X-based counterpart in terms of performance, scalability, and productivity. The results show that Chapel is much more expressive and up to 7.8× more productive than MPI+Pthreads. In addition, the Chapel-based search presents performance equivalent to MPI+Pthreads for its best results on 1024 cores and reaches up to 84% of the linear speedup. However, there are cases where the built-in load balancing provided by Chapel cannot produce regular load among compute nodes. In such cases, the MPI-based search can be up to 4.2× faster and reaches speedups up to 3× higher than its Chapel-based counterpart. Thorough feedback on the experience is given, pointing out the strengths and limitations of the two opposing approaches (Chapel vs. MPI+X).
To the best of our knowledge, the present study is pioneering within the context of exact parallel optimization.

Contributor: Tiago Carneiro Pessoa
Submitted on: Tuesday, November 19, 2019 - 6:10:22 PM
Last modification on: Tuesday, December 15, 2020 - 10:59:52 AM


Tiago Carneiro, Jan Gmys, Nouredine Melab, Daniel Tuyttens. Towards ultra-scale Branch-and-Bound using a high-productivity language. Future Generation Computer Systems, Elsevier, 2020, SI: On The Road to Exascale II: Advances on High Performance Computing and Simulations, 105, pp.196-209. ⟨10.1016/j.future.2019.11.011⟩. ⟨hal-02371238⟩


