- Every unit test now passes on Apple Silicon. The final piece of this
puzzle was porting our POSIX threads cancelation support, since that
works differently on ARM64 XNU vs. AMD64. Our semaphore support on
Apple Silicon is also superior now compared to AMD64, thanks to the
grand central dispatch library which lets *NSYNC locks go faster.
- The Cosmopolitan runtime is now more stable, particularly on Windows.
To do this, thread local storage is mandatory at all runtime levels,
and the innermost packages of the C library is no longer being built
using ASAN. TLS is being bootstrapped with a 128-byte TIB during the
process startup phase, and then later on the runtime re-allocates it
either statically or dynamically to support code using _Thread_local.
fork() and execve() now do a better job cooperating with threads. We
can now check how much stack memory is left in the process or thread
when functions like kprintf() / execve() etc. call alloca(), so that
ENOMEM can be raised, reduce a buffer size, or just print a warning.
- POSIX signal emulation is now implemented the same way kernels do it
with pthread_kill() and raise(). Any thread can interrupt any other
thread, regardless of what it's doing. If it's blocked on read/write
then the killer thread will cancel its i/o operation so that EINTR can
be returned in the mark thread immediately. If it's doing a tight CPU
bound operation, then that's also interrupted by the signal delivery.
Signal delivery works now by suspending a thread and pushing context
data structures onto its stack, and redirecting its execution to a
trampoline function, which calls SetThreadContext(GetCurrentThread())
when it's done.
- We're now doing a better job managing locks and handles. On NetBSD we
now close semaphore file descriptors in forked children. Semaphores on
Windows can now be canceled immediately, which means mutexes/condition
variables will now go faster. Apple Silicon semaphores can be canceled
too. We're now using Apple's pthread_yield() funciton. Apple _nocancel
syscalls are now used on XNU when appropriate to ensure pthread_cancel
requests aren't lost. The MbedTLS library has been updated to support
POSIX thread cancelations. See tool/build/runitd.c for an example of
how it can be used for production multi-threaded tls servers. Handles
on Windows now leak less often across processes. All i/o operations on
Windows are now overlapped, which means file pointers can no longer be
inherited across dup() and fork() for the time being.
- We now spawn a thread on Windows to deliver SIGCHLD and wakeup wait4()
which means, for example, that posix_spawn() now goes 3x faster. POSIX
spawn is also now more correct. Like Musl, it's now able to report the
failure code of execve() via a pipe although our approach favors using
shared memory to do that on systems that have a true vfork() function.
- We now spawn a thread to deliver SIGALRM to threads when setitimer()
is used. This enables the most precise wakeups the OS makes possible.
- The Cosmopolitan runtime now uses less memory. On NetBSD for example,
it turned out the kernel would actually commit the PT_GNU_STACK size
which caused RSS to be 6mb for every process. Now it's down to ~4kb.
On Apple Silicon, we reduce the mandatory upstream thread size to the
smallest possible size to reduce the memory overhead of Cosmo threads.
The examples directory has a program called greenbean which can spawn
a web server on Linux with 10,000 worker threads and have the memory
usage of the process be ~77mb. The 1024 byte overhead of POSIX-style
thread-local storage is now optional; it won't be allocated until the
pthread_setspecific/getspecific functions are called. On Windows, the
threads that get spawned which are internal to the libc implementation
use reserve rather than commit memory, which shaves a few hundred kb.
- sigaltstack() is now supported on Windows, however it's currently not
able to be used to handle stack overflows, since crash signals are
still generated by WIN32. However the crash handler will still switch
to the alt stack, which is helpful in environments with tiny threads.
- Test binaries are now smaller. Many of the mandatory dependencies of
the test runner have been removed. This ensures many programs can do a
better job only linking the the thing they're testing. This caused the
test binaries for LIBC_FMT for example, to decrease from 200kb to 50kb
- long double is no longer used in the implementation details of libc,
except in the APIs that define it. The old code that used long double
for time (instead of struct timespec) has now been thoroughly removed.
- ShowCrashReports() is now much tinier in MODE=tiny. Instead of doing
backtraces itself, it'll just print a command you can run on the shell
using our new `cosmoaddr2line` program to view the backtrace.
- Crash report signal handling now works in a much better way. Instead
of terminating the process, it now relies on SA_RESETHAND so that the
default SIG_IGN behavior can terminate the process if necessary.
- Our pledge() functionality has now been fully ported to AARCH64 Linux.
This change fixes bugs in the APE loader. The execve() unit tests are
now enabled for MODE=aarch64. See the README for how you need to have
binfmt_misc configured with Qemu to run them. Apple Silicon bugs have
been fixed too, e.g. tkill() now works.
make all test CC=fatcosmocc AR='fatcosmoar rcu'
This change introduces a program named mktemper.com which provides more
reliable and secure temporary file name generation for scripts. It also
makes our ar.com program more permissive in what commands it'll accept.
The cosmocc command is improved by this change too.
This way complex runtime features (e.g. ftrace, symbol tables) can
always yoink zipos support. This is important now that apelink.com
automates embedding symbol tables for multiple cpus.
- Found some bugs in LLVM compiler-rt library
- The useless LIBC_STUBS package is now deleted
- Improve the overflow checking story even further
- Get chibicc tests working in MODE=dbg mode again
- The libc/isystem/ headers now have correctly named guards
In order to improve our chances of success building other open source
projects we shouldn't define APIs that'll lead any ./configure script
astray. For example:
- brk() and sbrk() can break mac/windows support
- syscall() is a superb way to break portability
- arch_prctl() is the greatest of all horror shows
- Utilities like pledge.com now build
- kprintf() will no longer balk at 48-bit addresses
- There's a new aarch64-dbg build mode that should work
- gc() and defer() are mostly pacified; avoid using them on aarch64
- THIRD_PART_STB now has Arm Neon intrinsics for fast image handling
This change also removes the futimens() call on the Landlock Make output
file workaround, since it caused problems with commands like fixupobj
which modify-in-place. It turns out if a file is opened for writing and
then no writes actually occur, then the modified time doesn't change.
- 10.5% reduction of o//depend dependency graph
- 8.8% reduction in latency of make command
- Fix issue with temporary file cleanup
There's a new -w option in compile.com that turns off the recent
Landlock output path workaround for "good commands" which do not
unlink() the output file like GNU tooling does.
Our new GNU Make unveil sandboxing appears to have zero overhead
in the grand scheme of things. Full builds are pretty fast since
the only thing that's actually slowed us down is probably libcxx
make -j16 MODE=rel
RL: took 85,732,063µs wall time
RL: ballooned to 323,612kb in size
RL: needed 828,560,521µs cpu (11% kernel)
RL: caused 39,080,670 page faults (99% memcpy)
RL: 350,073 context switches (72% consensual)
RL: performed 0 reads and 11,494,960 write i/o operations
pledge() and unveil() no longer consider ENOSYS to be an error.
These functions have also been added to Python's cosmo module.
This change also removes some WIN32 APIs and System Five magnums
which we're not using and it's doubtful anyone else would be too
The whole repository is now buildable with GNU Make Landlock sandboxing.
This proves that no Makefile targets exist which touch files other than
their declared prerequisites. In order to do this, we had to:
1. Stop code morphing GCC output in package.com and instead run a
newly introduced FIXUPOBJ.COM command after GCC invocations.
2. Disable all the crumby Python unit tests that do things like create
files in the current directory, or rename() files between folders.
This ended up being a lot of tests, but most of them are still ok.
3. Introduce an .UNSANDBOXED variable to GNU Make to disable Landlock.
We currently only do this for things like `make tags`.
4. This change deletes some GNU Make code that was preventing the
execve() optimization from working. This means it should no longer
be necessary in most cases for command invocations to be indirected
through the cocmd interpreter.
5. Missing dependencies had to be declared in certain places, in cases
where they couldn't be automatically determined by MKDEPS.COM
6. The libcxx header situation has finally been tamed. One of the
things that makes this difficult is MKDEPS.COM only wants to
consider the first 64kb of a file, in order to go fast. But libcxx
likes to have #include lines buried after huge documentation.
7. An .UNVEIL variable has been introduced to GNU Make just in case
we ever wish to explicitly specify additional things that need to
be whitelisted which aren't strictly prerequisites. This works in
a manner similar to the recently introduced .EXTRA_PREREQS feature.
There's now a new build/bootstrap/make.com prebuilt binary available. It
should no longer be possible to write invalid Makefile code.
The pledge.com command now supports the new [WIP] unveil() support. For
example, to strongly sandbox our command for listing directories.
o//tool/build/assimilate.com o//examples/ls.com
pledge.com -v /etc -p 'stdio rpath' o//examples/ls.com /etc
This file system sandboxing is going to be perfect for us, because APE
binaries are self-contained static executables that really don't use the
filesystem that much. On the other hand, with non-static executables,
sandboxing is going to be more difficult. For example, here's how to
sandbox the `ls` command on the latest Alpine:
pledge.com -v rx:/lib -v /usr/lib -v /etc -p 'stdio rpath exec' ls /etc
This change fixes the `execpromises` API with pledge().
This change also adds unix.unveil() to redbean.
Fixes#494
- Fix some minor issues in ar.com
- Have execve() look for `ape` command
- Rewrite NT paths using /c/ rather /??/c:/
- Replace broken GCC symlinks with .sym files
- Rewrite $PATH environment variables on startup
- Make $(APE_NO_MODIFY_SELF) the default bootloader
- Add all build command dependencies to build/bootstrap
- Get the repository mostly building from source on non-Linux
- Add GetCpuCount() API to redbean
- Add unix.gmtime() API to redbean
- Add unix.readlink() API to redbean
- Add unix.localtime() API to redbean
- Perfect the new redbean UNIX module APIs
- Integrate with Linux clock_gettime() vDSO
- Run Lua garbage collector when malloc() fails
- Fix another regression quirk with linenoise repl
- Fix GetProgramExecutableName() for systemwide installs
- Fix a build flake with test/libc/mem/test.mk SRCS list
You can now use the hardest fastest and most dangerous language there is
with Cosmopolitan. So far about 75% of LLVM libcxx has been added. A few
breaking changes needed to be made to help this go smoothly.
- Rename nothrow to dontthrow
- Rename nodiscard to dontdiscard
- Add some libm functions, e.g. lgamma, nan, etc.
- Change intmax_t from int128 to int64 like everything else
- Introduce %jjd formatting directive for int128_t
- Introduce strtoi128(), strtou128(), etc.
- Rename bsrmax() to bsr128()
Some of the templates that should be working currently are std::vector,
std::string, std::map, std::set, std::deque, etc.
This commit makes numerous refinements to cosmopolitan memory handling.
The default stack size has been reduced from 2mb to 128kb. A new macro
is now provided so you can easily reconfigure the stack size to be any
value you want. Work around the breaking change by adding to your main:
STATIC_STACK_SIZE(0x00200000); // 2mb stack
If you're not sure how much stack you need, then you can use:
STATIC_YOINK("stack_usage_logging");
After which you can `sort -nr o/$MODE/stack.log`. Based on the unit test
suite, nothing in the Cosmopolitan repository (except for Python) needs
a stack size greater than 30kb. There are also new macros for detecting
the size and address of the stack at runtime, e.g. GetStackAddr(). We
also now support sigaltstack() so if you want to see nice looking crash
reports whenever a stack overflow happens, you can put this in main():
ShowCrashReports();
Under `make MODE=dbg` and `make MODE=asan` the unit testing framework
will now automatically print backtraces of memory allocations when
things like memory leaks happen. Bugs are now fixed in ASAN global
variable overrun detection. The memtrack and asan runtimes also handle
edge cases now. The new tools helped to identify a few memory leaks,
which are fixed by this change.
This change should fix an issue reported in #288 with ARG_MAX limits.
Fixing this doubled the performance of MKDEPS.COM and AR.COM yet again.
- python now mixes audio 10x faster
- python octal notation is restored
- chibicc now builds code 3x faster
- chibicc now has help documentation
- chibicc can now generate basic python bindings
- linenoise now supports some paredit-like features
See #141
It turned out that the linker was doing the wrong with the amalgamation
library concerning weak stubs. A regression test has been added and new
binaries have been uploaded to https://justine.lol/cosmopolitan/
Ideally this should be fixed by building a tool that turns multiple .a
files into a single .a file with deduplication. As a workaround for now
the cosmopolitan.a build is restructured to not include LIBC_STUBS which
meant technical debt needed to be paid off where non-stub interfaces
were moved to LIBC_INTRIN and LIBC_NEXGEN32E.
Thank @PerfectProductions in #31 for the report!
I wanted a tiny scriptable meltdown proof way to run userspace programs
and visualize how program execution impacts memory. It helps to explain
how things like Actually Portable Executable works. It can show you how
the GCC generated code is going about manipulating matrices and more. I
didn't feel fully comfortable with Qemu and Bochs because I'm not smart
enough to understand them. I wanted something like gVisor but with much
stronger levels of assurances. I wanted a single binary that'll run, on
all major operating systems with an embedded GPL barrier ZIP filesystem
that is tiny enough to transpile to JavaScript and run in browsers too.
https://justine.storage.googleapis.com/emulator625.mp4