Conference papers

Does Shared-Memory, Highly Multi-Threaded, Single-Application Scale on Many-Cores?

Abstract : Single-chip, cache-coherent multi-cores with up to 100 cores are now a reality. Many-cores with hundreds of cores are planned for the near future. Because of the large number of cores and for power-efficiency reasons (performance per watt), cores are becoming simpler, with small caches. To make efficient use of the parallelism these architectures offer, applications must be multi-threaded. The POSIX Threads (PThreads) standard is the most portable way to use threads across operating systems. It also serves as a low-level layer supporting other portable, shared-memory parallel environments such as OpenMP. In this paper, we experimentally verify the scalability of shared-memory, PThreads-based applications on a Cycle-Accurate-Bit-Accurate (CABA) simulated 512-core platform. Using two unmodified, highly multi-threaded applications, SPLASH-2 FFT and EPFilter (a medical-image noise-filtering application provided by Philips), our study shows a scalability limitation beyond 64 cores for FFT and 256 cores for EPFilter. Based on hardware event counters, our analysis shows that (i) the detected scalability limitation is a conceptual problem related to the notion of thread and process; and (ii) the small per-core caches found in many-cores exacerbate the problem. Finally, we present our solution in principle and future work.
Contributor : Ghassan Almaless
Submitted on : Wednesday, October 17, 2012 - 4:52:06 PM
Last modification on : Friday, January 8, 2021 - 5:32:08 PM


  • HAL Id : hal-00742947, version 1


Ghassan Almaless, Franck Wajsburt. Does Shared-Memory, Highly Multi-Threaded, Single-Application Scale on Many-Cores? 4th USENIX Workshop on Hot Topics in Parallelism, Jun 2012, Berkeley, CA, United States. ⟨hal-00742947⟩


