Journal articles

Fast and Portable Locking for Multicore Architectures

Jean-Pierre Lozi 1 Florian David 2 Gaël Thomas 3, 4, 5 Julia Lawall 2 Gilles Muller 2 
1 Laboratoire d'Informatique, Signaux, et Systèmes de Sophia-Antipolis (I3S) / Equipe MODALIS
Laboratoire I3S - SPARKS - Scalable and Pervasive softwARe and Knowledge Systems
2 Whisper - Well Honed Infrastructure Software for Programming Environments and Runtimes
LIP6 - Laboratoire d'Informatique de Paris 6, Inria de Paris
4 ACMES-SAMOVAR - Algorithmes, Composants, Modèles Et Services pour l'informatique répartie
SAMOVAR - Services répartis, Architectures, MOdélisation, Validation, Administration des Réseaux
Abstract : The scalability of multithreaded applications on current multicore systems is hampered by the performance of lock algorithms, due to the costs of access contention and cache misses. The main contribution presented in this article is a new locking technique, Remote Core Locking (RCL), that aims to accelerate the execution of critical sections in legacy applications on multicore architectures. The idea of RCL is to replace lock acquisitions with optimized remote procedure calls to a dedicated server hardware thread. RCL limits the performance collapse observed with other lock algorithms when many threads try to acquire a lock concurrently, and it removes the need to transfer lock-protected shared data to the hardware thread acquiring the lock, because such data can typically remain in the server's cache. Other contributions presented in this article include a profiler that identifies the locks that are the bottlenecks in multithreaded applications and that can thus benefit from RCL, and a reengineering tool that transforms POSIX lock acquisitions into RCL locks. Eighteen applications were used to evaluate RCL: the nine applications of the SPLASH-2 benchmark suite, the seven applications of the Phoenix 2 benchmark suite, Memcached, and Berkeley DB with a TPC-C client. Eight of these applications are unable to scale because of locks and benefit from RCL on an x86 machine with four AMD Opteron processors and 48 hardware threads. By using RCL instead of Linux POSIX locks, performance is improved by up to 2.5 times on Memcached, and up to 11.6 times on Berkeley DB with the TPC-C client. On a SPARC machine with two Sun UltraSPARC T2+ processors and 128 hardware threads, three applications benefit from RCL. In particular, performance is improved by up to 1.3 times with respect to Solaris POSIX locks on Memcached, and up to 7.9 times on Berkeley DB with the TPC-C client.
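The delegation idea described in the abstract (clients hand their critical sections to a dedicated server thread instead of acquiring a lock) can be illustrated with a small C sketch. This is a simplified illustration under stated assumptions, not the authors' RCL implementation: the names `slot_t`, `delegate`, `server_loop`, and `run_demo`, the fixed-size slot array, and the use of `sched_yield` while spinning are all hypothetical choices made for the example. An optimized version would presumably pin the server to its own hardware thread and keep each slot on a separate cache line.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CLIENTS 8

typedef void *(*cs_fn)(void *);          /* a critical section packaged as a function */

/* One request slot per client (hypothetical layout, unpadded for brevity). */
typedef struct {
    _Atomic(cs_fn) func;                 /* non-NULL means "request pending" */
    void *arg;                           /* context passed to the critical section */
    void *ret;                           /* result written by the server */
} slot_t;

static slot_t slots[MAX_CLIENTS];
static atomic_bool stop_server;

/* Client side: post the critical section, then wait until the server ran it. */
static void *delegate(int id, cs_fn fn, void *arg) {
    assert(id >= 0 && id < MAX_CLIENTS);
    slot_t *s = &slots[id];
    s->arg = arg;
    atomic_store_explicit(&s->func, fn, memory_order_release);
    while (atomic_load_explicit(&s->func, memory_order_acquire) != (cs_fn)NULL)
        sched_yield();                   /* keeps the demo usable on few cores */
    return s->ret;
}

/* Server side: scan the slots and run pending critical sections one by one. */
static void *server_loop(void *unused) {
    (void)unused;
    while (!atomic_load_explicit(&stop_server, memory_order_acquire)) {
        for (int i = 0; i < MAX_CLIENTS; i++) {
            cs_fn fn = atomic_load_explicit(&slots[i].func, memory_order_acquire);
            if (fn != (cs_fn)NULL) {
                slots[i].ret = fn(slots[i].arg);
                /* Clearing func signals completion to the waiting client. */
                atomic_store_explicit(&slots[i].func, (cs_fn)NULL,
                                      memory_order_release);
            }
        }
        sched_yield();
    }
    return NULL;
}

/* Example critical section: only ever executed by the server thread. */
static void *increment(void *arg) {
    long *counter = arg;
    *counter += 1;
    return NULL;
}

typedef struct { int id; int iters; long *counter; } client_arg_t;

static void *client_main(void *p) {
    client_arg_t *c = p;
    for (int i = 0; i < c->iters; i++)
        delegate(c->id, increment, c->counter);
    return NULL;
}

/* Starts one server and nclients clients; returns the final counter value. */
long run_demo(int nclients, int iters) {
    assert(nclients > 0 && nclients <= MAX_CLIENTS);
    long counter = 0;
    pthread_t server, clients[MAX_CLIENTS];
    client_arg_t args[MAX_CLIENTS];
    atomic_store(&stop_server, false);
    pthread_create(&server, NULL, server_loop, NULL);
    for (int i = 0; i < nclients; i++) {
        args[i] = (client_arg_t){ i, iters, &counter };
        pthread_create(&clients[i], NULL, client_main, &args[i]);
    }
    for (int i = 0; i < nclients; i++)
        pthread_join(clients[i], NULL);
    atomic_store(&stop_server, true);
    pthread_join(server, NULL);
    return counter;
}
```

Calling `run_demo(4, 500)` returns 2000: the shared counter is not protected by any mutex, yet no increment is lost, because every critical section runs serially on the server thread. In this sketch the counter is touched only by the server, which mirrors the abstract's point that lock-protected data can typically remain in the server's cache.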
Document type : Journal articles
Contributor : Gilles Muller
Submitted on : Thursday, January 7, 2016 - 1:31:21 PM
Last modification on : Thursday, August 4, 2022 - 4:53:55 PM
Long-term archiving on : Friday, April 8, 2016 - 1:23:24 PM


Jean-Pierre Lozi, Florian David, Gaël Thomas, Julia Lawall, Gilles Muller. Fast and Portable Locking for Multicore Architectures. ACM Transactions on Computer Systems, 2016, ⟨10.1145/2845079⟩. ⟨hal-01252167⟩


