[Bug sanitizer/90589] In Fedora 30 ps hangs using address sanitizer
mathieu.desnoyers at efficios dot com
gcc-bugzilla@gcc.gnu.org
Thu Aug 12 16:17:06 GMT 2021
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90589
Mathieu Desnoyers <mathieu.desnoyers at efficios dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |mathieu.desnoyers@efficios.com
--- Comment #13 from Mathieu Desnoyers <mathieu.desnoyers at efficios dot com> ---
So I reproduced this hang with pgrep on my Ubuntu 18.04.1 LTS.
LD_PRELOAD=/usr/lib/gcc/x86_64-linux-gnu/{8,9,10,11}/libasan.so pgrep something
[ hang ]
but
LD_PRELOAD=/usr/lib/gcc/x86_64-linux-gnu/7/libasan.so pgrep something
[ does not hang ]
So I focused my investigation on gcc 8's libasan, which is the first version to
introduce this issue.
The version of glibc on this x86-64 machine is 2.27-3ubuntu1.4.
I added breakpoints on all rwlock read lock, write lock, and unlock operations
(__GI___pthread_rwlock_rdlock, __GI___pthread_rwlock_wrlock,
__GI___pthread_rwlock_unlock). Looking at what happens to the _nl_state_lock
which protects glibc's i18n handling data structures is quite enlightening.
This lock is _not_ a recursive lock. AFAIU what happens here is that:
1. intl/bindtextdom.c:set_binding_values() locks the _nl_state_lock write lock.
2. it calls malloc, but this symbol is overridden by libasan.
3. asan's malloc handler ReplaceSystemMalloc looks up "__libc_malloc_dispatch"
and if that fails it looks up "__libc_malloc_default_dispatch".
4. The dlsym lookup failure calls __dlerror, which performs an i18n lookup, thus
taking the _nl_state_lock read lock. This should not happen, as it is not a
nestable lock.
5. The _nl_state_lock is unlocked once after the i18n lookup is done.
6. The _nl_state_lock is unlocked again in set_binding_values(), which corrupts
the lock state because it is not a nestable lock.
7. The next unlucky caller trying to take the lock hangs forever on a futex.
Based on an IRC discussion with Carlos O'Donell, it appears to be a defect in
glibc that a malloc interposer fails during i18n translation. There was a
discussion back in 2014
(https://sourceware.org/legacy-ml/libc-alpha/2014-12/msg00954.html) regarding
reentrancy of dlopen and other libdl interfaces.