Fix fork waiter leak in nsync

This change fixes a bug where nsync waiter objects would leak. As a
result, long-running programs like runitd would run out of file
descriptors on NetBSD, where waiter objects hold ksem file descriptors.
On other OSes this bug is mostly harmless, since the worst that can
happen with a futex is leaking a little bit of RAM. The bug happened
because tib_nsync was sneaking back in after the finalization code had
cleared it.

This change refactors the thread exiting code to handle nsync teardown
appropriately. In making this change I found another issue: buggy user
code that tries to exit without joining joinable threads which haven't
been detached would deadlock. That doesn't sound so bad, except the main
thread is a joinable thread, so this deadlock would be triggered in ways
that put libc at fault. We now auto-join threads, and libc logs a
warning to --strace when that happens for any thread.
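
For reference, the well-behaved pattern in user code is to join every joinable thread before exiting, so the auto-join path and its warning are never hit. A minimal sketch using only POSIX threads; `worker`, `run_and_join_workers`, and `MAX_WORKERS` are hypothetical names that are not part of this commit:

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 8

// Each worker marks its own slot so the caller can see that it ran.
static void *worker(void *arg) {
  *(int *)arg = 1;
  return NULL;
}

// Starts n joinable threads (at most MAX_WORKERS) and joins every one
// that was started before returning. Returns the number of workers
// that ran, or -1 on a bad argument or a pthread error.
int run_and_join_workers(int n) {
  pthread_t th[MAX_WORKERS];
  int done[MAX_WORKERS] = {0};
  int started = 0, total = 0, failed = 0;
  if (n < 0 || n > MAX_WORKERS)
    return -1;
  for (; started < n; ++started) {
    if (pthread_create(&th[started], NULL, worker, &done[started])) {
      failed = 1;
      break;
    }
  }
  // join everything we managed to start, even on the error path
  for (int i = 0; i < started; ++i) {
    if (pthread_join(th[i], NULL))
      failed = 1;
    else
      total += done[i];
  }
  return failed ? -1 : total;
}
```

A thread that should outlive its creator can be detached with pthread_detach() instead of being joined.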
Justine Tunney 2024-12-31 00:55:15 -08:00
parent fd7da586b5
commit 98c5847727
35 changed files with 299 additions and 173 deletions

@@ -59,7 +59,6 @@ extern pthread_mutex_t __sig_worker_lock;
void __dlopen_lock(void);
void __dlopen_unlock(void);
-void nsync_mu_semaphore_sem_fork_child(void);
// first and last and always
// it is the lord of all locks
@@ -147,7 +146,6 @@ static void fork_parent(void) {
}
static void fork_child(void) {
-nsync_mu_semaphore_sem_fork_child();
_pthread_mutex_wipe_np(&__dlopen_lock_obj);
_pthread_mutex_wipe_np(&__rand64_lock_obj);
_pthread_mutex_wipe_np(&__fds_lock_obj);
@@ -204,8 +202,8 @@ int _fork(uint32_t dwCreationFlags) {
struct CosmoTib *tib = __get_tls();
struct PosixThread *pt = (struct PosixThread *)tib->tib_pthread;
tid = IsLinux() || IsXnuSilicon() ? dx : sys_gettid();
-atomic_init(&tib->tib_tid, tid);
-atomic_init(&pt->ptid, tid);
+atomic_init(&tib->tib_ctid, tid);
+atomic_init(&tib->tib_ptid, tid);
// tracing and kisdangerous need this lock wiped a little earlier
atomic_init(&__maps.lock.word, 0);
@@ -214,6 +212,11 @@ int _fork(uint32_t dwCreationFlags) {
* it's now safe to call normal functions again
*/
+// this wipe must happen fast
+void nsync_waiter_wipe_(void);
+if (_weaken(nsync_waiter_wipe_))
+_weaken(nsync_waiter_wipe_)();
// turn other threads into zombies
// we can't free() them since we're monopolizing all locks
// we assume the operating system already reclaimed system handles
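
The `_weaken(nsync_waiter_wipe_)` guard above calls the wipe function only when nsync is actually linked into the program. A rough sketch of the same idea in plain GNU C, assuming GCC or Clang on an ELF target; `optional_wipe_hook` and `call_wipe_hook_if_linked` are hypothetical names, and this is not how Cosmopolitan's `_weaken` macro is implemented:

```c
#include <stddef.h>

// Hypothetical optional hook. It is only declared weak here, so when no
// object file in the link defines it, its address is null rather than
// the link failing with an undefined symbol error.
__attribute__((__weak__)) void optional_wipe_hook(void);

// Calls the hook only when a definition was linked in.
// Returns 1 if the hook was called, 0 if it was absent.
int call_wipe_hook_if_linked(void) {
  if (optional_wipe_hook) {
    optional_wipe_hook();
    return 1;
  }
  return 0;
}
```

The point of the pattern is that the caller does not force the optional code into every binary; programs that never use it pay only for the null check.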