Scalable fine-grained metric-based remeshing algorithm for manycore/NUMA architectures

Abstract : In this paper, we present a fine-grained, multi-stage, metric-based triangular remeshing algorithm for manycore and NUMA architectures. It is motivated by the dynamically evolving data dependencies and workload of such irregular algorithms, which often result in poor performance and data locality at high core counts. In this context, we devise a multi-stage algorithm in which a task graph is built for each kernel. Parallelism is then extracted through fine-grained independent set, maximal cardinality matching and graph coloring heuristics. In addition to precomputing index ranges, a dual-step atomic-based synchronization scheme is used for nodal data updates. Although the algorithm is inherently latency-bound, good overall scalability is achieved on a NUMA dual-socket Intel Haswell computing node and a dual-memory Intel KNL computing node (64 cores). The relevance of our synchronization scheme is highlighted through a comparison with the state of the art.
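As an illustration of how graph coloring can expose parallelism in a task graph, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical conflict graph in which an edge joins two tasks that touch shared mesh data, and it uses a plain sequential greedy coloring in place of the paper's heuristics. The names `greedy_coloring`, `color_classes` and `conflicts` are illustrative only.

```python
from collections import defaultdict


def greedy_coloring(num_tasks, conflicts):
    """Give each task the smallest color not used by its conflicting neighbors.

    `conflicts` is a hypothetical list of (task_a, task_b) pairs meaning the
    two tasks share data and must not run concurrently.
    """
    adjacency = defaultdict(set)
    for a, b in conflicts:
        adjacency[a].add(b)
        adjacency[b].add(a)

    color = {}
    for task in range(num_tasks):
        # Colors already taken by neighbors that were colored earlier.
        used = {color[n] for n in adjacency[task] if n in color}
        c = 0
        while c in used:
            c += 1
        color[task] = c
    return color


def color_classes(color):
    """Group tasks by color; tasks in one group have no mutual conflict."""
    classes = defaultdict(list)
    for task, c in color.items():
        classes[c].append(task)
    return [classes[c] for c in sorted(classes)]


# Four tasks arranged in a conflict cycle: 0-1, 1-2, 2-3, 3-0.
conflicts = [(0, 1), (1, 2), (2, 3), (3, 0)]
colors = greedy_coloring(4, conflicts)
batches = color_classes(colors)
print(batches)
```

Each resulting batch could be processed in parallel without synchronization between its tasks, with a barrier between batches. A production remesher would color in parallel and balance the batch sizes, which this sketch does not attempt.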
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01609940
Contributor : Frédéric Davesne <>
Submitted on : Thursday, October 10, 2019 - 12:57:48 PM
Last modification on : Monday, October 28, 2019 - 10:50:22 AM

File

RLPlG-EP-2017.pdf
Files produced by the author(s)

Citation

Hoby Rakotoarivelo, Franck Ledoux, Franck Pommereau, Nicolas Le Goff. Scalable fine-grained metric-based remeshing algorithm for manycore/NUMA architectures. 23rd International Conference on Parallel and Distributed Computing (Euro-Par 2017), Aug 2017, Santiago de Compostela, Spain. pp.594--606, ⟨10.1007/978-3-319-64203-1_43⟩. ⟨hal-01609940⟩
