- More timspec_*() and timeval_*() APIs have been introduced.
- The copyfd() function is now simplified thanks to POSIX rules.
- More Cosmo-specific APIs have been moved behind the COSMO define.
- The setitimer() polyfill for Windows NT is now much higher quality.
- Fixed build error for MODE=aarch64 due to -mstringop-strategy=loop.
- This change introduces `make MODE=nox87 toolchain` which makes it
possible to build programs using your cosmocc toolchain that don't
have legacy fpu instructions. This is useful, for example, if you
want to have a ~22kb tinier blink virtual machine.
This change fixes an issue with the tcflow() magic numbers that was
causing bash to freeze up on Linux. While auditing termios polyfills,
several other issues were identified with XNU/BSD compatibility.
Out of an abundance of caution this change undefines as much surface
area from libc/calls/struct/termios.h as possible, so that autoconf
scripts are less likely to detect non-POSIX teletypewriter APIs that
haven't been polyfilled by Cosmopolitan.
This is a *breaking change* for your static archives in /opt/cosmos if
you use the cosmocc toolchain. That's because this change disables the
ioctl() undiamonding trick for code outside the monorepo, specifically
because it'll lead to brittle ABI breakages like this. If you're using
the cosmocc toolchain, you'll need to rebuild libraries like ncurses,
readline, etc. Yes diamonds cause bloat. To work around that, consider
using tcgetwinsize() instead of ioctl(TIOCGWINSZ) since it'll help you
avoid pulling every single ioctl-related polyfill into the linkage.
The cosmocc script was specifying -DNDEBUG for some reason. It's fixed.
This change takes an entirely new approach to the incremental linking of
pkzip executables. The assets created by zipobj.com are now treated like
debug data. After a .com.dbg is compiled, fixupobj.com should be run, so
it can apply fixups to the offsets and move the zip directory to the end
of the file. Since debug data doesn't get objcopy'd, a new tool has been
introduced called zipcopy.com which should be run after objcopy whenever
a .com file is created. This is all automated by the `cosmocc` toolchain
which is rapidly becoming the new recommended approach.
This change also introduces the new C23 checked arithmetic macros.
This change improves the way internal APIs are being hidden behind the
`COSMO` define. The cosmo.h header will take care of defining that, so
that a separate define statement isn't needed. This change also does a
lot more to define which APIs are standard, and which belong to Cosmo.
This change integrates e58abc1110b335a3341e8ad5821ad8e3880d9bb2 from
https://github.com/ahgamut/musl-cross-make/ which fixes the issues we
were having with our C language extension for symbolic constants. This
change also performs some code cleanup and bug fixes to getaddrinfo().
It's now possible to compile projects like ncurses, readline and python
without needing to patch anything upstream, except maybe a line or two.
Pretty soon it should be possible to build a Linux distro on Cosmo.
In order to improve our chances of success building other open source
projects we shouldn't define APIs that'll lead any ./configure script
astray. For example:
- brk() and sbrk() can break mac/windows support
- syscall() is a superb way to break portability
- arch_prctl() is the greatest of all horror shows
- Work towards improving non-optimized build support
- Introduce MODE=zero which is -O0 without ASAN/UBSAN
- Use system GCC when ~/.cosmo.mk has USE_SYSTEM_TOOLCHAIN=1
- Have package.com check .privileged code doesn't call non-privileged
- Get mprotect_test working on aarch64
- Get completion working on python.com repl again
- Improve quality of printvideo.com and printimage.com
- Fix bug in openpty() so examples/script.c works again
This change greatly reduces the number of modules that need to be
compiled. The only issue right now is that sometimes when viewing
symbol table entries, the aliased symbol is chosen.
This change implements a new approach to function call logging, that's
based on the GCC flag: -fpatchable-function-entry. Read the commentary
in build/config.mk to learn how it works.
- Now using 10x better GCD semaphores
- We now generate Linux-like thread ids
- We now use fast system clock / sleep libraries
- The APE M1 loader now generates Linux-like stacks
llama.com can now load weights that use the new file format which was
introduced a few weeks ago. Note that, unlike llama.cpp, we will keep
support for old file formats in our tool so you don't need to convert
your weights when the upstream project makes breaking changes. Please
note that using ggjt v3 does make avx2 inference go 5% faster for me.
This change progresses our AARCH64 support:
- The AARCH64 build and tests are now passing
- Add 128-bit floating-point support to printf()
- Fix clone() so it initializes cosmo's x28 TLS register
- Fix TLS memory layout issue with aarch64 _Alignas vars
- Revamp microbenchmarking tools so they work on aarch64
- Make some subtle improvements to aarch64 crash reporting
- Make kisdangerous() memory checks more accurate on aarch64
- Remove sys_open() since it's not available on Linux AARCH64
This change makes general improvements to Cosmo and Redbean:
- Introduce GetHostIsa() function in Redbean
- You can now feature check using pledge(0, 0)
- You can now feature check using unveil("",0)
- Refactor some more x86-specific asm comments
- Refactor and write docs for some libm functions
- Make the mmap() API behave more similar to Linux
- Fix WIFSIGNALED() which wrongly returned true for zero
- Rename some obscure cosmo keywords from noFOO to dontFOO
There's a new program named ape/ape-m1.c which will be used to build an
embeddable binary that can load ape and elf executables. The support is
mostly working so far, but still chasing down ABI issues.
- Fix UX issues with llama.com
- Do housekeeping on libm code
- Add more vectorization to GGML
- Get GGJT quantizer programs working well
- Have the quantizer keep the output layer as f16c
- Prefetching improves performance 15% if you use fewer threads
- Perform some housekeeping on scalar math function code
- Import ARM's Optimized Routines for SIMD string processing
- Upgrade to latest Chromium zlib and enable more SIMD optimizations
- Introduce epoll_pwait()
- Rewrite -ftrapv and ffs() libraries in C code
- Use more FreeBSD code in math function library
- Get significantly more tests passing on qemu-aarch64
- Fix many Musl long double functions that were broken on AARCH64
make -j8 o//third_party/radpajama/radpajama.com
make -j8 o//third_party/radpajama/radpajama-chat.com
This change gets the radpajama.mk config working. This package depends
on THIRD_PARTY_GGML but it's configured to call ggjt_v1(), so that the
library will provide the old quantizers. The ggml_quantize_chunk() API
will now dispatch to older quantizers based on the configured version.
This change makes quantized models (e.g. q4_0) go 10% faster on Macs
however doesn't offer much improvement for Intel PC hardware.
This change syncs llama.cpp 699b1ad7fe6f7b9e41d3cb41e61a8cc3ea5fc6b5
which recently made a breaking change to nearly all its file formats
without any migration. Since that'll break hundreds upon hundreds of
models on websites like HuggingFace llama.com will support both file
formats because llama.com will never ever break the GGJT file format
- Utilities like pledge.com now build
- kprintf() will no longer balk at 48-bit addresses
- There's a new aarch64-dbg build mode that should work
- gc() and defer() are mostly pacified; avoid using them on aarch64
- THIRD_PART_STB now has Arm Neon intrinsics for fast image handling
Example use case for JSON completion:
$ m=opt
$ make -j16 m=$m o/$m/third_party/ggml/llama.com
$ o/$m/third_party/ggml/llama.com -m llama.bin -p '{"key": "life", "val": ' -r '}'
42}
This provides better control. More sophisticated facilities for
controlling text generation will be provided soon enough.
We can once again create 2mb statically-linked Python binaries:
$ make -j8 m=tiny o/tiny/examples/pyapp/pyapp.com
$ ls -hal o/tiny/examples/pyapp/pyapp.com
-rwxr-xr-x 1 jart jart 2.1M May 1 14:04 o/tiny/examples/pyapp/pyapp.com
$ o/tiny/examples/pyapp/pyapp.com
cosmopolitan is cool!
The regression was caused by Python thread support in b15f9eb58
- Introduce -v and --verbose flags
- Don't print stats / diagnostics unless -v is passed
- Reduce --top_p default from 0.95 to 0.70
- Change --reverse-prompt to no longer imply --interactive
- Permit --reverse-prompt specifying custom EOS if non-interactive
Python threads are now generally working, however some parts of Python's
regression tests for threads are flaky. This is possibly due to needing
more locking primitives in Cosmo's IO system call wrappers, e.g. close.
make o//third_party/python/Lib/test/test_threading.py.runs
See #747
- enable WITH_THREAD and _POSIX_THREADS
- add headers everywhere
- breaks only two tests (faulthandler and signal)
- disabled terminal completion because it causes segfaults for some
reason (probably could not get the current thread)
- Improve compatibility with Blink virtual machine
- Add non-POSIX APIs for joining threads and signal masks
- Never ever use anything except 32-bit integers for atomics
- Add some `#undef` statements to workaround `ctags` problems
libc/integral changes aren't checked in the build dependency, due to
being explicitly listed in .UNVEIL, which is how this breakage ended
up accidentally slipping through the cracks.
- clock_nanosleep() is now much faster on OpenBSD and NetBSD
- Thread joining is now much faster on NetBSD
- FreeBSD timestamps are now more accurate
- Thread spawning now goes faster on XNU
- Clean up the clone() code
- Clean up sigaction() code
- Add a port scanner example
- Introduce a ParseCidr() API
- Clean up our futex abstraction code
- Fix a harmless integer overflow in ParseIp()
- Use kernel semaphores on NetBSD to make threads much faster
- Exhaustively document cancellation points
- Rename SIGCANCEL to SIGTHR just like BSDs
- Further improve POSIX thread cancellations
- Ensure asynchronous cancellations work correctly
- Elevate the quality of getrandom() and getentropy()
- Make futexes cancel correctly on OpenBSD 6.x and 7.x
- Add reboot.com and shutdown.com to examples directory
- Remove underscore prefix from awesome timespec_*() APIs
- Create assertions that help verify our cancellation points
- Remove bad timespec APIs (cmp generalizes eq/ne/gt/gte/lt/lte)
This change makes some miracle modifications to the System Five system
call support, which lets us have safe, correct, and atomic handling of
thread cancellations. It all turned out to be cheaper than anticipated
because it wasn't necessary to modify the system call veneers. We were
able to encode the cancellability of each system call into the magnums
found in libc/sysv/syscalls.sh. Since cancellations are so waq, we are
also supporting a lovely Musl Libc mask feature for raising ECANCELED.
- ASAN memory morgue is now lockless
- Make C11 atomics header more portable
- Rewrote pthread keys support to be lockless
- Simplify Python's unicode table unpacking code
- Make crash report write(2) closer to being atomic
- Make it possible to strace/ftrace a single thread
- ASAN now checks nul-terminated strings fast and properly
- Windows fork() now restores TLS memory of calling thread
- Invent iso8601us() for faster timestamps
- Improve --strace descriptions of sigset_t
- Rebuild the Landlock Make bootstrap binary
- Introduce MODE=sysv for non-Windows builds
- Permit OFD fcntl() locks under pledge(flock)
- redbean can now protect your kernel from ddos
- Have vfork() fallback to sys_fork() not fork()
- Change kmalloc() to not die when out of memory
- Improve documentation for some termios functions
- Rewrite putenv() and friends to conform to POSIX
- Fix linenoise + strace verbosity issue on Windows
- Fix regressions in our ability to show backtraces
- Change redbean SetHeader() to no-op if value is nil
- Improve fcntl() so SQLite locks work in non-WAL mode
- Remove some unnecessary work during fork() on Windows
- Create redbean-based SSL reverse proxy for IPv4 TurfWar
- Fix ape/apeinstall.sh warning when using non-bash shells
- Add ProgramTrustedIp(), and IsTrustedIp() APIs to redbean
- Support $PWD, $UID, $GID, and $EUID in command interpreter
- Introduce experimental JTqFpD APE prefix for non-Windows builds
- Invent blackhole daemon for firewalling IP addresses via UNIX named socket
- Add ProgramTokenBucket(), AcquireToken(), and CountTokens() APIs to redbean
If threads are being used, then fork() will now acquire and release and
runtime locks so that fork() may be safely used from threads. This also
makes vfork() thread safe, because pthread mutexes will do nothing when
the process is a child of vfork(). More torture tests have been written
to confirm this all works like a charm. Additionally:
- Invent hexpcpy() api
- Rename nsync_malloc_() to kmalloc()
- Complete posix named semaphore implementation
- Make pthread_create() asynchronous signal safe
- Add rm, rmdir, and touch to command interpreter builtins
- Invent sigisprecious() and modify sigset functions to use it
- Add unit tests for posix_spawn() attributes and fix its bugs
One unresolved problem is the reclaiming of *NSYNC waiter memory in the
forked child processes, within apps which have threads waiting on locks
This lets our system() and popen() commands function sort of like
BusyBox and ToyBox. By default the Cosmopolitan Shell is lightweight.
But if you use STATIC_YOINK then you can pull the individual commands
you want into the linkage, and they'll be included in a single binary.
For example the demo binary embeds `tr` and `sed` and ends up ~140kb.
- SQLite file locking now works on Windows
- SQLite will now use fdatasync() on non-Apple platforms
- Fix Ctrl-C handler on Windows to not crash with TLS
- Signals now work in multithreaded apps on Windows
- fcntl() will now accurately report EINVAL errors
- fcntl() now has excellent --strace logging
- Token bucket replenish now go 100x faster
- *NSYNC cancellations now work on Windows
- Support closefrom() on NetBSD
The cosmopolitan command interpreter now has 13 builtin commands,
variable support, support for ; / && / || syntax, asynchronous support,
and plenty of unit tests with bug fixes.
This change fixes a bug in posix_spawn() with null envp arg. strace
logging now uses atomic writes for scatter functions. Breaking change
renaming GetCpuCount() to _getcpucount(). TurfWar is now updated to use
the new token bucket algorithm. WIN32 affinity masks now inherit across
fork() and execve().
This change addresses various open source compatibility issues, so that
we pass 313/411 of the tests in https://github.com/jart/libc-test where
earlier today we were passing about 30/411 of them, due to header toil.
Please note that Glibc only passes 341/411 so 313 today is pretty good!
- Make the conformance of libc/isystem/ headers nearly perfect
- Import more of the remaining math library routines from Musl
- Fix inconsistencies with type signatures of calls like umask
- Write tests for getpriority/setpriority which work great now
- conform to `struct sockaddr *` on remaining socket functions
- Import a bunch of uninteresting stdlib functions e.g. rand48
- Introduce readdir_r, scandir, pthread_kill, sigsetjmp, etc..
Follow the instructions in our `tool/scripts/cosmocc` toolchain to run
these tests yourself. You use `make CC=cosmocc` on the test repository
- Change IDT code so kprintf() isn't mandatory dependency
- Document current intentions around pthread_cancel()
- Make _npassert() an _unassert() in MODE=tiny
You can now do things like implement mutexes using futexes in your
redbean lua code. This provides the fastest possible inter-process
communication for your production systems when SQLite alone as ipc
or things like pipes aren't sufficient.
* Proof of concept of sqlite serialization
This is a minimal proof of concept in order to show that it is easily possible to store the sqlite database within the zip file itself not requiring creating an external file first. Changes include compiling the sqlite library with the serialization flag, adding serialize/deserialize to the lua sqlite library and demonstrating the work via the redbean demo.
* Change demo for sqlite serialization
As explained in https://github.com/jart/cosmopolitan/pull/436#issuecomment-1164706893 the original use case is not possible with sqlite serialization, as an in-memory database cannot be shared across multiple processes. Thereby, this use case simply creates a backup of the in-memory database created in '.init.lua' and loads it to do a query.
* Fix sqlite3_deserialize parameters
The call to the sqlite3 library for the deserilization wasn't fully correct. This should fix the size parameters.
- Shutdown process now has optimal cancellation latency
- Fairer techniques for shedding connections under load
- We no longer need to call poll() which is now removed
It can now handle 240k SQLite write QPS at 3ms 99 percentile latency.
We're still working out the kinks since it's brand new. But we've got
this running in production already!
This change reduces the .bss memory requirement for all executables by
O(64kb). The brk system calls are now fully tested and figured out and
might be useful for tiny programs that only target System Five.
This change improves copy_file_range(), sendfile(), splice(), openpty(),
closefrom(), close_range(), fadvise() and posix_fadvise() in addition to
writing tests that confirm things like errno and seeking behavior across
platforms. We now less aggressively polyfill behavior with some of these
functions when the platform support isn't available. Please see:
https://justine.lol/cosmopolitan/functions.html
This change upgrades to the latest Chromium Zlib, fixes bugs in redbean,
and introduces better support for reverse proxies like Cloudflare. This
change improves the security of redbean and it's recommended that users
upgrade to the release that'll follow. This change also updates the docs
to clarify how to use the security tools redbean provides e.g. pledge(),
unveil(), and the MODE=asan builds which improve memory safety.
Doing this makes binaries tinier, since we don't need to have all the
extra code for supporting a 32-bit address space. It also benefits us
because we're able to use WIN32 futexes, which makes locking simpler.
b69f3d2488 is what officially ended our
Windows 7 support. This change is merely a formalization. You can use
old versions of Cosmo now and forevermore if you need Windows 7 since
our repository is hermetic and vendors all its dependencies.
Won't fix#617
- Fix preadv() and pwritev() for old distros
- Introduce _npassert() and _unassert() macros
- Prove that file locks work properly on Windows
- Support fcntl(F_DUPFD_CLOEXEC) on more systems
This makes breaking changes to add underscores to many non-standard
function names provided by the c library. MODE=tiny is now tinier and
we now use smaller locks that are better for tiny apps in this mode.
Some headers have been renamed to be in the same folder as the build
package, so it'll be easier to know which build dependency is needed.
Certain old misguided interfaces have been removed. Intel intrinsics
headers are now listed in libc/isystem (but not in the amalgamation)
to help further improve open source compatibility. Header complexity
has also been reduced. Lastly, more shell scripts are now available.
The organization of the source files is now much more rational.
Old experiments that didn't work out are now deleted. Naming of
things like files is now more intuitive.
This change tunes the default stack size for the outside world to 8mb
while at the same time, reducing Cosmopolitan's default stack size to
64kb. You can override the stack size using STATIC_STACK_SIZE(). Your
build scripts should point to o//ape/public/ape.lds
This change also fixes the definition of SOMAXCONN and removes AF_RDS
since it's not polyfilled and Python 3.11 complained.
- You can now use _gc(malloc()) in multithreaded programs
- This change fixes a bug where fork() on NT disabled TLS
- Fixed TLS code morphing on XNU/NT, for R8-R15 registers
It now works most excellently across all supported operating
sytsems (earlier it didn't work on NT and XNU). Demo code is
available in examples/clock.c and this change also adds some
of the newer ANSI C time functions like timespec_get(), plus
timespec_getres() which hasn't even come out yet as it's C23
pthread_mutex_lock() now uses a better algorithm which goes much faster
in multithreaded environments that have lock contention. This comes at
the cost of adding some fixed-cost overhead to mutex invocations. That
doesn't matter for Cosmopolitan because our core libraries all encode
locking operations as NOP instructions when in single-threaded mode.
Overhead only applies starting the moment you first call clone().
This change restores the .symtab symbol table files in our flagship
programs (e.g. redbean.com, python.com) needed to show backtraces. This
also rolls back earlier changes to zip.com w.r.t. temp directories since
the right way to do it turned out to be the -b DIR flag.
This change also improves the performance of zip.com. It turned out
mmap() wasn't being used, because zip.com was assuming a 4096-byte
granularity, but cosmo requires 65536. There was also a chance to speed
up stdio scanning using the unlocked functions.