Our cosmocc binaries are now built with GCC 14.1, using the Cosmo commit
efb3a34608 from yesterday.
GCC is now configured using --enable-analyzer so you can use -fanalyzer.
This change solves an issue where many threads attempting to spawn forks
at once would cause fork() performance to degrade with the thread count.
Things got real nasty on NetBSD, which slowed down the whole test fleet,
because there's no vfork() and we're forced to use fork() in our server.
threads count task
1 1062 fork+exit+wait
2 668 fork+exit+wait
4 66 fork+exit+wait
8 19 fork+exit+wait
16 22 fork+exit+wait
32 16 fork+exit+wait
Things are now much less bad on NetBSD, but not great, since it does not
have futexes; we rely on its semaphore file descriptors to do conditions
threads count task
1 1085 fork+exit+wait
2 842 fork+exit+wait
4 532 fork+exit+wait
8 400 fork+exit+wait
16 276 fork+exit+wait
32 66 fork+exit+wait
With OpenBSD which also lacks vfork(), things were just as bad as NetBSD
threads count task
1 584 fork+exit+wait
2 687 fork+exit+wait
4 206 fork+exit+wait
8 24 fork+exit+wait
16 33 fork+exit+wait
32 26 fork+exit+wait
But since OpenBSD has futexes fork() works terrifically thanks to *NSYNC
threads count task
1 525 fork+exit+wait
2 580 fork+exit+wait
4 451 fork+exit+wait
8 479 fork+exit+wait
16 408 fork+exit+wait
32 373 fork+exit+wait
This issue would most likely only manifest itself, when pthread_atfork()
callers manage to slip a spin lock into the outermost position of fork's
list of locks. Since fork() is very slow, a spin lock can be devastating
Needless to say vfork() rules and anyone who says differently is kidding
themselves. Look at what a FreeBSD 14.1 virtual machine with equal specs
can do over the course of three hundred milliseconds.
threads count task
1 2559 vfork+exit+wait
2 5389 vfork+exit+wait
4 34933 vfork+exit+wait
8 43273 vfork+exit+wait
16 49648 vfork+exit+wait
32 40247 vfork+exit+wait
So it's a shame that so few OSes support vfork(). It creates an unsavory
situation, where someone wanting to build a server that spawns processes
would be better served to not use threads and favor a multiprocess model
The cosmocc.zip toolchain will now include four builds of the libcosmo.a
runtime libraries. You can pass the -mdbg flag if you want to debug your
cosmopolitan runtime. You can pass the -moptlinux flag if you don't want
windows code lurking in your binary. See tool/cosmocc/README.md for more
details on how these flags may be used and their important implications.
GCC 13+ changed its policies to be very aggressive about breaking builds
that have anything resembling K&R C. I very strongly disagree with these
decisions. Users who think their compiler should should also be a linter
are perfectly welcome to opt-in to -Wimplicit-int.
It's been thirteen years and C++ still hasn't implemented this wonderful
simple builtin keyword. In C++23 a solution was provided for making this
work in C++ which is libcxx's stdatomic.h. Including that header schleps
in literally 253 unique header files!! Many of the header files it needs
are libc header files like pthread.h where we need to have the _Atomic()
keyword, but since <atomic> depends on pthreads we can't have it include
the <stdatomic.h> header that defines _Atomic for C++ users, and instead
we simply make the type non-atomic, hoping and praying only C code shall
use those internal data structures. This just shows how STL clowns can't
be trusted to define the innermost primitives of a language. They should
instead be focusing on being the best at algorithms and data structures.
- NetBSD should now have faster synchronization
- POSIX barriers may now be shared across processes
- An edge case with memory map tracking has been fixed
- Grand Central Dispatch is no longer used on MacOS ARM64
- POSIX mutexes in normal mode now use futexes across processes
- CLOCK_THREAD_CPUTIME_ID
- CLOCK_PROCESS_CPUTIME_ID
Cosmo now supports the above constants universally across supported OSes
therefore it's now safe to let programs detect their presence w/ #ifdefs
Cosmopolitan Libc once called this important function although somewhere
along the way, possibly in a refactoring, it got removed and __tls_alloc
has always been zero ever since.