This change introduces a new deadlock detector for Cosmo's POSIX threads
implementation. Error check mutexes will now track a DAG of nested locks
and report EDEADLK when a deadlock is theoretically possible. These will
occur rarely, but it's important for production hardening your code. You
don't even need to change your mutexes to use the POSIX error check mode
because `cosmocc -mdbg` will enable error checking on mutexes by default
globally. When cycles are found, an error message showing your demangled
symbols describing the strongly connected component are printed and then
the SIGTRAP is raised, which means you'll also get a backtrace if you're
using ShowCrashReports() too. This new error checker is so low-level and
so pure that it's able to verify the relationships of every libc runtime
lock, including those locks upon which the mutex implementation depends.
Using callbacks is still problematic with cosmo_dlopen() due to the need
to restore the TLS register. So using callbacks is even more strict than
using signal handlers. We are better off introducing a cosmoaudio_poll()
function. It makes the API more UNIX-like. How bad could the latency be?
This change switches c++ exception handling from sjlj to standard dwarf.
It's needed because clang for aarch64 doesn't support sjlj. It turns out
that libunwind had a bare-metal configuration that made this easy to do.
This change gets the new experimental cosmocc -mclang flag in a state of
working so well that it can now be used to build all of llamafile and it
goes 3x faster in terms of build latency, without trading away any perf.
The int_fast16_t and int_fast32_t types are now always defined as 32-bit
in the interest of having more abi consistency between cosmocc -mgcc and
-mclang mode.
C++ code compiles very slowly with cosmocc, possibly because we're using
LLVM LIBCXX with GCC, and LLVM doesn't work as hard to make GCC go fast.
Therefore, it should be possible, to ask cosmocc to favor Clang over GCC
under the hood. On llamafile, my intention's to use this to make certain
files, e.g. llama.cpp/common.cpp, go from taking 17 seconds to 5 seconds
This new -mclang flag isn't ready for production yet since there's still
the question of how to get Clang to generate SJLJ exception code. If you
use this, then it's recommended you also pass -fno-exceptions.
The tradeoff is we're adding a 121mb binary to the cosmocc distribution.
There are no plans as of yet to fully migrate to Clang since GCC is very
good and has always treated us well.
This change implements the compiler runtime for ARM v8.1 ISE atomics and
gets rid of the mandatory -mno-outline-atomics flag. It can dramatically
speed things up, on newer ARM CPUs, as indicated by the changed lines in
test/libc/thread/footek_test.c. In llamafile dispatching on hwcap atomic
also shaved microseconds off synchronization barriers.
Our cosmocc binaries are now built with GCC 14.1, using the Cosmo commit
efb3a34608 from yesterday.
GCC is now configured using --enable-analyzer so you can use -fanalyzer.
The cosmocc.zip toolchain will now include four builds of the libcosmo.a
runtime libraries. You can pass the -mdbg flag if you want to debug your
cosmopolitan runtime. You can pass the -moptlinux flag if you don't want
windows code lurking in your binary. See tool/cosmocc/README.md for more
details on how these flags may be used and their important implications.
GCC 13+ changed its policies to be very aggressive about breaking builds
that have anything resembling K&R C. I very strongly disagree with these
decisions. Users who think their compiler should should also be a linter
are perfectly welcome to opt-in to -Wimplicit-int.
The Cosmopolitan Compiler Collection now includes the following programs
- `ar.ape` is a faster alternative to `ar rcsD` for creating determistic
static archives. It's ~10x faster than GNU because it isn't quadratic.
It'll even outperform LLVM ar by 2x, thanks to writev/copy_file_range.
- `sha256sum.ape` is a faster alternative to the `sha256sum` command. It
goes 2x faster since it leverages vectorized assembly implementations.
- `resymbol` is a brand new program we invented, like objcopy, that lets
you rename all the global symbols in a .o file to have a new suffix or
prefix. In the future, this will be used by cosmocc automatically when
building -O3 math kernels, that need to be vectorized for all hardware
- `gzip.ape` is a faster version of the `gzip` command, that is included
by most Linux distros. It gains better performance using Chromium Zlib
which, once again, includes highly optimized assembly, that Mark Adler
won't merge into the official MS-DOS compatible zlib codebase.
- `cocmd` is the cosmopolitan shell. It can function as a faster `sh -c`
alternative than bash and dash as the `SHELL = /opt/cosmocc/bin/cocmd`
at the top of your Makefile. Please note you should be using the cosmo
fork of GNU make (already included), since normal make won't recognize
this as a bourne-compatible shell and remove the execve() optimization
which makes things slower. In some ways that's true. This doesn't have
a complete POSIX shell implementation. However it's enough for cosmo's
mono repo. It also implements faster behaviors in some respects.
The following programs are also introduced, which aren't as interesting.
The main reason why they're here is so Cosmopolitan's mono repo shall be
able to remove build/bootstrap/ in future editions. That way we can keep
build utilities better up to date, without bloating the git history much
- `chmod.ape` for hermeticity
- `cp.ape` for hermeticity
- `echo.ape` for hermeticity
- `objbincopy` is an objcopy-like tool that's used to build ape loader
- `package.ape` is used for strict dependency checking of object graph
- `rm.ape` for hermeticity
- `touch.ape` for hermeticity
The way to use double linked lists, is to remove all the things you want
to work on, insert them into a new list on the stack. Then once you have
all the work items, you release the lock, do your work, and then lock it
again, to add the shelled out items back to a global freelist.
This fixes a regression in mmap(MAP_FIXED) on Windows caused by a recent
revision. This change also fixes ZipOS so it no longer needs a MAP_FIXED
mapping to open files from the PKZIP store. The memory mapping mutex was
implemented incorrectly earlier which meant that ftrace and strace could
cause cause crashes. This lock and other recursive mutexes are rewritten
so that it should be provable that recursive mutexes in cosmopolitan are
asynchronous signal safe.
We're now able to rewind the instruction pointer in x86 backtraces. This
helps ensure addr2line cannot print information about unrelated adjacent
code. I've restored -fno-schedule-insns2 in most cases because it really
does cause unpredictable breakage for backtraces.
This was a good idea back when we were only using it to build various
open source projects. However it no longer makes sense that many more
people are depending on cosmocc, to develop new software. Our tooling
shouldn't be making these kinds of decisions for the user.
MaGuess on Discord pointed out the fact that cosmocc contradicts itself
on the signedness of `char`. It's up to each platform to choose one, so
the cosmo platform shall choose signed. The rationale is it makes the C
language syntax more internally similar. `char` should be `signed char`
for the same reason `int` means `signed int`. It's recommended that you
still assume `char` could go either way since that's portable thinking.
But if you want to assume we'll always have signed char, that's ok too.
You can now run commands like `x86_64-unknown-cosmo-c++ -S` when using
your cosmocc toolchain. Please note the S flag isn't supported for the
cosmocc command itself.
The WIN32 CreateProcess() function does not require an .exe or .com
suffix in order to spawn an executable. Now that we have Cosmo bash
we're no longer so dependent on the cmd.exe prompt.