Locking/Mutex primitives in Pthreads

Spin locks


On x86_64, pthread_spin_lock is implemented with lock decl, followed by pause while it spins on a held lock. pthread_spin_trylock is implemented with lock cmpxchg followed by cmove (conditional move if equal).
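The instruction sequence above can be approximated in portable C. The following is only a sketch, not the Glibc source: it uses C11 atomics in place of hand-written assembly, and the names sketch_spinlock, sketch_spin_* and cpu_relax are made up for illustration. It assumes the convention that 1 means "free" and any value of 0 or less means "held".

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical spin lock type: 1 = free, 0 or less = held. */
typedef struct { atomic_int value; } sketch_spinlock;

/* Hint to the CPU that we are busy-waiting (the "pause" instruction on x86). */
static inline void cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();
#endif
}

static inline void sketch_spin_init(sketch_spinlock *l)
{
    atomic_init(&l->value, 1);
}

static inline void sketch_spin_lock(sketch_spinlock *l)
{
    /* Atomic decrement (the role of "lock decl"): we own the lock only if
       the previous value was 1. */
    while (atomic_fetch_sub_explicit(&l->value, 1, memory_order_acquire) != 1) {
        /* Lock is held: wait with plain loads until it looks free again. */
        while (atomic_load_explicit(&l->value, memory_order_relaxed) <= 0)
            cpu_relax();
    }
}

static inline int sketch_spin_trylock(sketch_spinlock *l)
{
    int expected = 1;
    /* Compare-and-swap 1 -> 0 (the role of "lock cmpxchg"). */
    if (atomic_compare_exchange_strong_explicit(&l->value, &expected, 0,
                                                memory_order_acquire,
                                                memory_order_relaxed))
        return 0;
    return EBUSY;
}

static inline void sketch_spin_unlock(sketch_spinlock *l)
{
    atomic_store_explicit(&l->value, 1, memory_order_release);
}
```

The inner loop in sketch_spin_lock only reads the lock word, so waiting threads do not keep issuing atomic writes while the lock is held.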



Mutexes

The implementation of mutexes in Glibc changes from version to version. The latest version uses low-level locks (in the Glibc source tree, nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h) in conjunction with the futex ("fast userspace mutex") system call.

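POSIX requires a mutex to have an attribute called robustness. If a thread that owns a robust mutex terminates while holding it, the next thread that acquires the mutex is notified of the termination by the error return value EOWNERDEAD. The notified thread can then either mark the mutex as consistent again with pthread_mutex_consistent before unlocking it, or unlock it without doing so, in which case the mutex becomes permanently unusable and later lock attempts fail with ENOTRECOVERABLE.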

POSIX also requires a mutex to have an attribute called protocol. By default, the protocol is PTHREAD_PRIO_NONE. Otherwise it can be PTHREAD_PRIO_INHERIT (priority inheritance) or PTHREAD_PRIO_PROTECT (priority ceiling).

Read/write locks


Condition variables


Non-standard Pthread call


Locking/Mutex primitives in GNU OpenMP

GCC has supported OpenMP since version 4.x.0, using the GNU OpenMP runtime library (libgomp). Details of OpenMP and automatic parallelization in GCC can be found in Diego Novillo's paper and slides.


gomp_mutex_{lock|unlock} and gomp_ptrlock_{get|set}, which are inline functions.

gomp_mutex_* operate on integers, while gomp_ptrlock_* operate on pointers.

Depending on how libgomp is built (whether the ./configure script is run with --enable-linux-futex, which is the default, or with --disable-linux-futex), gomp_mutex_{lock|unlock} either use their own implementation (as in libgomp/config/linux/mutex.h) or use pthread_mutex_{lock|unlock} (as in libgomp/config/posix/mutex.h).

In the former case, gomp_mutex_{lock|unlock} first try a fast path using the GCC built-in function __sync_lock_test_and_set; if that fails, they fall through to gomp_mutex_{lock|unlock}_slow.

gomp_mutex_lock_slow spins for up to millions or even billions of iterations before falling back to the futex system call. The spin count is controlled by the environment variables OMP_WAIT_POLICY and GOMP_SPINCOUNT, and it also depends on whether there is oversubscription, i.e. more threads than CPUs.

gomp_mutex_unlock_slow simply calls futex with the FUTEX_WAKE operation.

gomp_ptrlock_{get|set}_slow are similar.
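The fast-path/slow-path split described for gomp_mutex_* can be sketched as follows. This is not the libgomp source: it is a simplified, Linux-only illustration that uses C11 atomics rather than the __sync built-ins, a fixed and arbitrary spin count (SKETCH_SPIN_COUNT) rather than one derived from OMP_WAIT_POLICY or GOMP_SPINCOUNT, and a three-state lock word. All sketch_* and demo_* names are hypothetical.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical lock word: 0 = free, 1 = locked, 2 = locked, waiters possible. */
typedef atomic_int sketch_mutex;

enum { SKETCH_SPIN_COUNT = 1000 };  /* arbitrary value chosen for this sketch */

static long sketch_futex(sketch_mutex *addr, int op, int val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void sketch_mutex_lock_slow(sketch_mutex *m)
{
    /* Phase 1: spin in user space, hoping the owner releases the lock soon. */
    for (int i = 0; i < SKETCH_SPIN_COUNT; i++) {
        int expected = 0;
        if (atomic_compare_exchange_weak(m, &expected, 1))
            return;
    }
    /* Phase 2: mark the lock contended and sleep in the kernel until woken. */
    while (atomic_exchange(m, 2) != 0)
        sketch_futex(m, FUTEX_WAIT, 2);
}

static void sketch_mutex_lock(sketch_mutex *m)
{
    int expected = 0;
    if (!atomic_compare_exchange_strong(m, &expected, 1))  /* fast path */
        sketch_mutex_lock_slow(m);
}

static void sketch_mutex_unlock(sketch_mutex *m)
{
    if (atomic_exchange(m, 0) == 2)          /* fast path unless contended */
        sketch_futex(m, FUTEX_WAKE, 1);      /* slow path: wake one waiter */
}

static sketch_mutex demo_lock;
static long demo_counter;

static void *demo_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        sketch_mutex_lock(&demo_lock);
        demo_counter++;
        sketch_mutex_unlock(&demo_lock);
    }
    return NULL;
}

/* Runs four workers; returns the final counter (40000 if no update is lost). */
static long sketch_mutex_demo(void)
{
    pthread_t t[4];

    demo_counter = 0;
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, demo_worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return demo_counter;
}
```

In the uncontended case neither lock nor unlock enters the kernel. The system call is made only after spinning has failed, or when a waiter may be asleep.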


gomp_sem_{wait|post}, which are inline functions.

As in the previous case, there are two implementations. The POSIX version uses pthread_mutex_{lock|unlock} and pthread_cond_{wait|signal}.
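A semaphore built from a mutex and a condition variable, in the spirit of the POSIX version, might look like the sketch below. It is an illustration only and not the libgomp code; the type sketch_sem and all sketch_sem_* and demo_* names are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical counting semaphore: a counter guarded by a mutex, plus a
   condition variable for threads waiting for the counter to become positive. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int value;
} sketch_sem;

static int sketch_sem_init(sketch_sem *s, int value)
{
    int rc = pthread_mutex_init(&s->mutex, NULL);

    if (rc != 0)
        return rc;
    rc = pthread_cond_init(&s->cond, NULL);
    if (rc != 0) {
        pthread_mutex_destroy(&s->mutex);
        return rc;
    }
    s->value = value;
    return 0;
}

static void sketch_sem_wait(sketch_sem *s)
{
    pthread_mutex_lock(&s->mutex);
    while (s->value <= 0)               /* loop guards against spurious wakeups */
        pthread_cond_wait(&s->cond, &s->mutex);
    s->value--;
    pthread_mutex_unlock(&s->mutex);
}

static void sketch_sem_post(sketch_sem *s)
{
    pthread_mutex_lock(&s->mutex);
    s->value++;
    pthread_cond_signal(&s->cond);      /* wake one waiter, if there is one */
    pthread_mutex_unlock(&s->mutex);
}

static sketch_sem demo_sem;
static int demo_flag;

static void *demo_poster(void *arg)
{
    (void)arg;
    demo_flag = 1;
    sketch_sem_post(&demo_sem);
    return NULL;
}

/* Waits for another thread to post; returns the flag that thread set. */
static int sketch_sem_demo(void)
{
    pthread_t t;
    int seen;

    demo_flag = 0;
    sketch_sem_init(&demo_sem, 0);
    pthread_create(&t, NULL, demo_poster, NULL);
    sketch_sem_wait(&demo_sem);         /* blocks until demo_poster has posted */
    seen = demo_flag;
    pthread_join(t, NULL);
    return seen;
}
```

The wait side re-checks the counter in a loop because pthread_cond_wait may return without a matching signal.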

The Linux version works much like the mutexes: gomp_sem_{wait|post} first try a fast path using the built-in function __sync_bool_compare_and_swap; if that fails, they fall through to gomp_sem_{wait|post}_slow.

Semaphores here are used to implement Barriers.

Locking/Mutex primitives in Intel OpenMP

There are certain functions which seem related to locks (the initial "k" refers to KAI, Kuck & Associates, Inc., a high-performance compiler vendor that Intel acquired in 2000)
and some related to profiling:
Also note that the global variable __kmp_lock_method will be set to 1 if HyperThreading is available, and 2 otherwise.