cosmopolitan/third_party/libcxx/__threading_support
Justine Tunney ec480f5aa0
Make improvements
- Every unit test now passes on Apple Silicon. The final piece of this
  puzzle was porting our POSIX threads cancelation support, since that
  works differently on ARM64 XNU vs. AMD64. Our semaphore support on
  Apple Silicon is also superior now compared to AMD64, thanks to the
  grand central dispatch library which lets *NSYNC locks go faster.

- The Cosmopolitan runtime is now more stable, particularly on Windows.
  To do this, thread local storage is mandatory at all runtime levels,
  and the innermost packages of the C library is no longer being built
  using ASAN. TLS is being bootstrapped with a 128-byte TIB during the
  process startup phase, and then later on the runtime re-allocates it
  either statically or dynamically to support code using _Thread_local.
  fork() and execve() now do a better job cooperating with threads. We
  can now check how much stack memory is left in the process or thread
  when functions like kprintf() / execve() etc. call alloca(), so that
  ENOMEM can be raised, reduce a buffer size, or just print a warning.

- POSIX signal emulation is now implemented the same way kernels do it
  with pthread_kill() and raise(). Any thread can interrupt any other
  thread, regardless of what it's doing. If it's blocked on read/write
  then the killer thread will cancel its i/o operation so that EINTR can
  be returned in the mark thread immediately. If it's doing a tight CPU
  bound operation, then that's also interrupted by the signal delivery.
  Signal delivery works now by suspending a thread and pushing context
  data structures onto its stack, and redirecting its execution to a
  trampoline function, which calls SetThreadContext(GetCurrentThread())
  when it's done.

- We're now doing a better job managing locks and handles. On NetBSD we
  now close semaphore file descriptors in forked children. Semaphores on
  Windows can now be canceled immediately, which means mutexes/condition
  variables will now go faster. Apple Silicon semaphores can be canceled
  too. We're now using Apple's pthread_yield() funciton. Apple _nocancel
  syscalls are now used on XNU when appropriate to ensure pthread_cancel
  requests aren't lost. The MbedTLS library has been updated to support
  POSIX thread cancelations. See tool/build/runitd.c for an example of
  how it can be used for production multi-threaded tls servers. Handles
  on Windows now leak less often across processes. All i/o operations on
  Windows are now overlapped, which means file pointers can no longer be
  inherited across dup() and fork() for the time being.

- We now spawn a thread on Windows to deliver SIGCHLD and wakeup wait4()
  which means, for example, that posix_spawn() now goes 3x faster. POSIX
  spawn is also now more correct. Like Musl, it's now able to report the
  failure code of execve() via a pipe although our approach favors using
  shared memory to do that on systems that have a true vfork() function.

- We now spawn a thread to deliver SIGALRM to threads when setitimer()
  is used. This enables the most precise wakeups the OS makes possible.

- The Cosmopolitan runtime now uses less memory. On NetBSD for example,
  it turned out the kernel would actually commit the PT_GNU_STACK size
  which caused RSS to be 6mb for every process. Now it's down to ~4kb.
  On Apple Silicon, we reduce the mandatory upstream thread size to the
  smallest possible size to reduce the memory overhead of Cosmo threads.
  The examples directory has a program called greenbean which can spawn
  a web server on Linux with 10,000 worker threads and have the memory
  usage of the process be ~77mb. The 1024 byte overhead of POSIX-style
  thread-local storage is now optional; it won't be allocated until the
  pthread_setspecific/getspecific functions are called. On Windows, the
  threads that get spawned which are internal to the libc implementation
  use reserve rather than commit memory, which shaves a few hundred kb.

- sigaltstack() is now supported on Windows, however it's currently not
  able to be used to handle stack overflows, since crash signals are
  still generated by WIN32. However the crash handler will still switch
  to the alt stack, which is helpful in environments with tiny threads.

- Test binaries are now smaller. Many of the mandatory dependencies of
  the test runner have been removed. This ensures many programs can do a
  better job only linking the the thing they're testing. This caused the
  test binaries for LIBC_FMT for example, to decrease from 200kb to 50kb

- long double is no longer used in the implementation details of libc,
  except in the APIs that define it. The old code that used long double
  for time (instead of struct timespec) has now been thoroughly removed.

- ShowCrashReports() is now much tinier in MODE=tiny. Instead of doing
  backtraces itself, it'll just print a command you can run on the shell
  using our new `cosmoaddr2line` program to view the backtrace.

- Crash report signal handling now works in a much better way. Instead
  of terminating the process, it now relies on SA_RESETHAND so that the
  default SIG_IGN behavior can terminate the process if necessary.

- Our pledge() functionality has now been fully ported to AARCH64 Linux.
2023-09-18 21:04:47 -07:00

494 lines
13 KiB
C++

// -*- C++ -*-
//===----------------------------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
#ifndef _LIBCPP_THREADING_SUPPORT
#define _LIBCPP_THREADING_SUPPORT
#include "third_party/libcxx/__config"
#include "third_party/libcxx/chrono"
#include "third_party/libcxx/iosfwd"
#include "libc/thread/thread2.h"
#include "libc/thread/thread.h"
#include "third_party/libcxx/errno.h"
#ifndef _LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER
#pragma GCC system_header
#endif
#if defined(_LIBCPP_HAS_THREAD_API_EXTERNAL)
# include "third_party/libcxx/__external_threading"
#elif !defined(_LIBCPP_HAS_NO_THREADS)
#if defined(_LIBCPP_HAS_THREAD_API_PTHREAD)
#include "libc/thread/thread.h"
#include "libc/calls/calls.h"
#include "libc/calls/struct/sched_param.h"
#include "libc/sysv/consts/sched.h"
#endif
#if defined(_LIBCPP_HAS_THREAD_LIBRARY_EXTERNAL) || \
defined(_LIBCPP_BUILDING_THREAD_LIBRARY_EXTERNAL) || \
defined(_LIBCPP_HAS_THREAD_API_WIN32)
#define _LIBCPP_THREAD_ABI_VISIBILITY _LIBCPP_FUNC_VIS
#else
#define _LIBCPP_THREAD_ABI_VISIBILITY inline _LIBCPP_INLINE_VISIBILITY
#endif
#if defined(__FreeBSD__) && defined(__clang__) && __has_attribute(no_thread_safety_analysis)
#define _LIBCPP_NO_THREAD_SAFETY_ANALYSIS __attribute__((no_thread_safety_analysis))
#else
#define _LIBCPP_NO_THREAD_SAFETY_ANALYSIS
#endif
typedef ::timespec __libcpp_timespec_t;
#endif // !defined(_LIBCPP_HAS_NO_THREADS)
_LIBCPP_PUSH_MACROS
#include "third_party/libcxx/__undef_macros"
_LIBCPP_BEGIN_NAMESPACE_STD
#if !defined(_LIBCPP_HAS_NO_THREADS)
#if defined(_LIBCPP_HAS_THREAD_API_PTHREAD)
// Mutex
typedef pthread_mutex_t __libcpp_mutex_t;
#define _LIBCPP_MUTEX_INITIALIZER PTHREAD_MUTEX_INITIALIZER
typedef pthread_mutex_t __libcpp_recursive_mutex_t;
// Condition Variable
typedef pthread_cond_t __libcpp_condvar_t;
#define _LIBCPP_CONDVAR_INITIALIZER PTHREAD_COND_INITIALIZER
// Execute once
typedef pthread_once_t __libcpp_exec_once_flag;
#define _LIBCPP_EXEC_ONCE_INITIALIZER PTHREAD_ONCE_INIT
// Thread id
typedef pthread_t __libcpp_thread_id;
// Thread
#define _LIBCPP_NULL_THREAD 0U
typedef pthread_t __libcpp_thread_t;
// Thread Local Storage
typedef pthread_key_t __libcpp_tls_key;
#define _LIBCPP_TLS_DESTRUCTOR_CC
#elif !defined(_LIBCPP_HAS_THREAD_API_EXTERNAL)
// Mutex
typedef void* __libcpp_mutex_t;
#define _LIBCPP_MUTEX_INITIALIZER 0
#if defined(_M_IX86) || defined(__i386__) || defined(_M_ARM) || defined(__arm__)
typedef void* __libcpp_recursive_mutex_t[6];
#elif defined(_M_AMD64) || defined(__x86_64__) || defined(_M_ARM64) || defined(__aarch64__)
typedef void* __libcpp_recursive_mutex_t[5];
#else
# error Unsupported architecture
#endif
// Condition Variable
typedef void* __libcpp_condvar_t;
#define _LIBCPP_CONDVAR_INITIALIZER 0
// Execute Once
typedef void* __libcpp_exec_once_flag;
#define _LIBCPP_EXEC_ONCE_INITIALIZER 0
// Thread ID
typedef long __libcpp_thread_id;
// Thread
#define _LIBCPP_NULL_THREAD 0U
typedef void* __libcpp_thread_t;
// Thread Local Storage
typedef long __libcpp_tls_key;
#define _LIBCPP_TLS_DESTRUCTOR_CC __stdcall
#endif // !defined(_LIBCPP_HAS_THREAD_API_PTHREAD) && !defined(_LIBCPP_HAS_THREAD_API_EXTERNAL)
#if !defined(_LIBCPP_HAS_THREAD_API_EXTERNAL)
// Mutex
_LIBCPP_THREAD_ABI_VISIBILITY
int __libcpp_recursive_mutex_init(__libcpp_recursive_mutex_t *__m);
_LIBCPP_THREAD_ABI_VISIBILITY _LIBCPP_NO_THREAD_SAFETY_ANALYSIS
int __libcpp_recursive_mutex_lock(__libcpp_recursive_mutex_t *__m);
_LIBCPP_THREAD_ABI_VISIBILITY _LIBCPP_NO_THREAD_SAFETY_ANALYSIS
bool __libcpp_recursive_mutex_trylock(__libcpp_recursive_mutex_t *__m);
_LIBCPP_THREAD_ABI_VISIBILITY _LIBCPP_NO_THREAD_SAFETY_ANALYSIS
int __libcpp_recursive_mutex_unlock(__libcpp_recursive_mutex_t *__m);
_LIBCPP_THREAD_ABI_VISIBILITY
int __libcpp_recursive_mutex_destroy(__libcpp_recursive_mutex_t *__m);
_LIBCPP_THREAD_ABI_VISIBILITY _LIBCPP_NO_THREAD_SAFETY_ANALYSIS
int __libcpp_mutex_lock(__libcpp_mutex_t *__m);
_LIBCPP_THREAD_ABI_VISIBILITY _LIBCPP_NO_THREAD_SAFETY_ANALYSIS
bool __libcpp_mutex_trylock(__libcpp_mutex_t *__m);
_LIBCPP_THREAD_ABI_VISIBILITY _LIBCPP_NO_THREAD_SAFETY_ANALYSIS
int __libcpp_mutex_unlock(__libcpp_mutex_t *__m);
_LIBCPP_THREAD_ABI_VISIBILITY
int __libcpp_mutex_destroy(__libcpp_mutex_t *__m);
// Condition variable
_LIBCPP_THREAD_ABI_VISIBILITY
int __libcpp_condvar_signal(__libcpp_condvar_t* __cv);
_LIBCPP_THREAD_ABI_VISIBILITY
int __libcpp_condvar_broadcast(__libcpp_condvar_t* __cv);
_LIBCPP_THREAD_ABI_VISIBILITY _LIBCPP_NO_THREAD_SAFETY_ANALYSIS
int __libcpp_condvar_wait(__libcpp_condvar_t* __cv, __libcpp_mutex_t* __m);
_LIBCPP_THREAD_ABI_VISIBILITY _LIBCPP_NO_THREAD_SAFETY_ANALYSIS
int __libcpp_condvar_timedwait(__libcpp_condvar_t *__cv, __libcpp_mutex_t *__m,
__libcpp_timespec_t *__ts);
_LIBCPP_THREAD_ABI_VISIBILITY
int __libcpp_condvar_destroy(__libcpp_condvar_t* __cv);
// Execute once
_LIBCPP_THREAD_ABI_VISIBILITY
int __libcpp_execute_once(__libcpp_exec_once_flag *flag,
void (*init_routine)());
// Thread id
_LIBCPP_THREAD_ABI_VISIBILITY
bool __libcpp_thread_id_equal(__libcpp_thread_id t1, __libcpp_thread_id t2);
_LIBCPP_THREAD_ABI_VISIBILITY
bool __libcpp_thread_id_less(__libcpp_thread_id t1, __libcpp_thread_id t2);
// Thread
_LIBCPP_THREAD_ABI_VISIBILITY
bool __libcpp_thread_isnull(const __libcpp_thread_t *__t);
_LIBCPP_THREAD_ABI_VISIBILITY
int __libcpp_thread_create(__libcpp_thread_t *__t, void *(*__func)(void *),
void *__arg);
_LIBCPP_THREAD_ABI_VISIBILITY
__libcpp_thread_id __libcpp_thread_get_current_id();
_LIBCPP_THREAD_ABI_VISIBILITY
__libcpp_thread_id __libcpp_thread_get_id(const __libcpp_thread_t *__t);
_LIBCPP_THREAD_ABI_VISIBILITY
int __libcpp_thread_join(__libcpp_thread_t *__t);
_LIBCPP_THREAD_ABI_VISIBILITY
int __libcpp_thread_detach(__libcpp_thread_t *__t);
_LIBCPP_THREAD_ABI_VISIBILITY
void __libcpp_thread_yield();
_LIBCPP_THREAD_ABI_VISIBILITY
void __libcpp_thread_sleep_for(const chrono::nanoseconds& __ns);
// Thread local storage
_LIBCPP_THREAD_ABI_VISIBILITY
int __libcpp_tls_create(__libcpp_tls_key* __key,
void(_LIBCPP_TLS_DESTRUCTOR_CC* __at_exit)(void*));
_LIBCPP_THREAD_ABI_VISIBILITY
void *__libcpp_tls_get(__libcpp_tls_key __key);
_LIBCPP_THREAD_ABI_VISIBILITY
int __libcpp_tls_set(__libcpp_tls_key __key, void *__p);
#endif // !defined(_LIBCPP_HAS_THREAD_API_EXTERNAL)
#if (!defined(_LIBCPP_HAS_THREAD_LIBRARY_EXTERNAL) || \
defined(_LIBCPP_BUILDING_THREAD_LIBRARY_EXTERNAL)) && \
defined(_LIBCPP_HAS_THREAD_API_PTHREAD)
int __libcpp_recursive_mutex_init(__libcpp_recursive_mutex_t *__m)
{
pthread_mutexattr_t attr;
int __ec = pthread_mutexattr_init(&attr);
if (__ec)
return __ec;
__ec = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
if (__ec) {
pthread_mutexattr_destroy(&attr);
return __ec;
}
__ec = pthread_mutex_init(__m, &attr);
if (__ec) {
pthread_mutexattr_destroy(&attr);
return __ec;
}
__ec = pthread_mutexattr_destroy(&attr);
if (__ec) {
pthread_mutex_destroy(__m);
return __ec;
}
return 0;
}
int __libcpp_recursive_mutex_lock(__libcpp_recursive_mutex_t *__m)
{
return pthread_mutex_lock(__m);
}
bool __libcpp_recursive_mutex_trylock(__libcpp_recursive_mutex_t *__m)
{
return pthread_mutex_trylock(__m) == 0;
}
int __libcpp_recursive_mutex_unlock(__libcpp_mutex_t *__m)
{
return pthread_mutex_unlock(__m);
}
int __libcpp_recursive_mutex_destroy(__libcpp_recursive_mutex_t *__m)
{
return pthread_mutex_destroy(__m);
}
int __libcpp_mutex_lock(__libcpp_mutex_t *__m)
{
return pthread_mutex_lock(__m);
}
bool __libcpp_mutex_trylock(__libcpp_mutex_t *__m)
{
return pthread_mutex_trylock(__m) == 0;
}
int __libcpp_mutex_unlock(__libcpp_mutex_t *__m)
{
return pthread_mutex_unlock(__m);
}
int __libcpp_mutex_destroy(__libcpp_mutex_t *__m)
{
return pthread_mutex_destroy(__m);
}
// Condition Variable
int __libcpp_condvar_signal(__libcpp_condvar_t *__cv)
{
return pthread_cond_signal(__cv);
}
int __libcpp_condvar_broadcast(__libcpp_condvar_t *__cv)
{
return pthread_cond_broadcast(__cv);
}
int __libcpp_condvar_wait(__libcpp_condvar_t *__cv, __libcpp_mutex_t *__m)
{
return pthread_cond_wait(__cv, __m);
}
int __libcpp_condvar_timedwait(__libcpp_condvar_t *__cv, __libcpp_mutex_t *__m,
__libcpp_timespec_t *__ts)
{
return pthread_cond_timedwait(__cv, __m, __ts);
}
int __libcpp_condvar_destroy(__libcpp_condvar_t *__cv)
{
return pthread_cond_destroy(__cv);
}
// Execute once
int __libcpp_execute_once(__libcpp_exec_once_flag *flag,
void (*init_routine)()) {
return pthread_once(flag, init_routine);
}
// Thread id
// Returns non-zero if the thread ids are equal, otherwise 0
bool __libcpp_thread_id_equal(__libcpp_thread_id t1, __libcpp_thread_id t2)
{
return pthread_equal(t1, t2) != 0;
}
// Returns non-zero if t1 < t2, otherwise 0
bool __libcpp_thread_id_less(__libcpp_thread_id t1, __libcpp_thread_id t2)
{
return t1 < t2;
}
// Thread
bool __libcpp_thread_isnull(const __libcpp_thread_t *__t) {
return *__t == 0;
}
int __libcpp_thread_create(__libcpp_thread_t *__t, void *(*__func)(void *),
void *__arg)
{
return pthread_create(__t, 0, __func, __arg);
}
__libcpp_thread_id __libcpp_thread_get_current_id()
{
return pthread_self();
}
__libcpp_thread_id __libcpp_thread_get_id(const __libcpp_thread_t *__t)
{
return *__t;
}
int __libcpp_thread_join(__libcpp_thread_t *__t)
{
return pthread_join(*__t, 0);
}
int __libcpp_thread_detach(__libcpp_thread_t *__t)
{
return pthread_detach(*__t);
}
void __libcpp_thread_yield()
{
pthread_yield();
}
void __libcpp_thread_sleep_for(const chrono::nanoseconds& __ns)
{
using namespace chrono;
seconds __s = duration_cast<seconds>(__ns);
__libcpp_timespec_t __ts;
typedef decltype(__ts.tv_sec) ts_sec;
_LIBCPP_CONSTEXPR ts_sec __ts_sec_max = numeric_limits<ts_sec>::max();
if (__s.count() < __ts_sec_max)
{
__ts.tv_sec = static_cast<ts_sec>(__s.count());
__ts.tv_nsec = static_cast<decltype(__ts.tv_nsec)>((__ns - __s).count());
}
else
{
__ts.tv_sec = __ts_sec_max;
__ts.tv_nsec = 999999999; // (10^9 - 1)
}
while (nanosleep(&__ts, &__ts) == -1 && errno == EINTR);
}
// Thread local storage
int __libcpp_tls_create(__libcpp_tls_key *__key, void (*__at_exit)(void *))
{
return pthread_key_create(__key, __at_exit);
}
void *__libcpp_tls_get(__libcpp_tls_key __key)
{
return pthread_getspecific(__key);
}
int __libcpp_tls_set(__libcpp_tls_key __key, void *__p)
{
return pthread_setspecific(__key, __p);
}
#endif // !_LIBCPP_HAS_THREAD_LIBRARY_EXTERNAL || _LIBCPP_BUILDING_THREAD_LIBRARY_EXTERNAL
class _LIBCPP_TYPE_VIS thread;
class _LIBCPP_TYPE_VIS __thread_id;
namespace this_thread
{
_LIBCPP_INLINE_VISIBILITY __thread_id get_id() _NOEXCEPT;
} // this_thread
template<> struct hash<__thread_id>;
class _LIBCPP_TEMPLATE_VIS __thread_id
{
// FIXME: pthread_t is a pointer on Darwin but a long on Linux.
// NULL is the no-thread value on Darwin. Someone needs to check
// on other platforms. We assume 0 works everywhere for now.
__libcpp_thread_id __id_;
public:
_LIBCPP_INLINE_VISIBILITY
__thread_id() _NOEXCEPT : __id_(0) {}
friend _LIBCPP_INLINE_VISIBILITY
bool operator==(__thread_id __x, __thread_id __y) _NOEXCEPT
{ // don't pass id==0 to underlying routines
if (__x.__id_ == 0) return __y.__id_ == 0;
if (__y.__id_ == 0) return false;
return __libcpp_thread_id_equal(__x.__id_, __y.__id_);
}
friend _LIBCPP_INLINE_VISIBILITY
bool operator!=(__thread_id __x, __thread_id __y) _NOEXCEPT
{return !(__x == __y);}
friend _LIBCPP_INLINE_VISIBILITY
bool operator< (__thread_id __x, __thread_id __y) _NOEXCEPT
{ // id==0 is always less than any other thread_id
if (__x.__id_ == 0) return __y.__id_ != 0;
if (__y.__id_ == 0) return false;
return __libcpp_thread_id_less(__x.__id_, __y.__id_);
}
friend _LIBCPP_INLINE_VISIBILITY
bool operator<=(__thread_id __x, __thread_id __y) _NOEXCEPT
{return !(__y < __x);}
friend _LIBCPP_INLINE_VISIBILITY
bool operator> (__thread_id __x, __thread_id __y) _NOEXCEPT
{return __y < __x ;}
friend _LIBCPP_INLINE_VISIBILITY
bool operator>=(__thread_id __x, __thread_id __y) _NOEXCEPT
{return !(__x < __y);}
_LIBCPP_INLINE_VISIBILITY
void __reset() { __id_ = 0; }
template<class _CharT, class _Traits>
friend
_LIBCPP_INLINE_VISIBILITY
basic_ostream<_CharT, _Traits>&
operator<<(basic_ostream<_CharT, _Traits>& __os, __thread_id __id);
private:
_LIBCPP_INLINE_VISIBILITY
__thread_id(__libcpp_thread_id __id) : __id_(__id) {}
friend __thread_id this_thread::get_id() _NOEXCEPT;
friend class _LIBCPP_TYPE_VIS thread;
friend struct _LIBCPP_TEMPLATE_VIS hash<__thread_id>;
};
namespace this_thread
{
inline _LIBCPP_INLINE_VISIBILITY
__thread_id
get_id() _NOEXCEPT
{
return __libcpp_thread_get_current_id();
}
} // this_thread
#endif // !_LIBCPP_HAS_NO_THREADS
_LIBCPP_END_NAMESPACE_STD
_LIBCPP_POP_MACROS
#endif // _LIBCPP_THREADING_SUPPORT