- Perform some housekeeping on scalar math function code
- Import ARM's Optimized Routines for SIMD string processing
- Upgrade to latest Chromium zlib and enable more SIMD optimizations
- Introduce epoll_pwait()
- Rewrite -ftrapv and ffs() libraries in C code
- Use more FreeBSD code in math function library
- Get significantly more tests passing on qemu-aarch64
- Fix many Musl long double functions that were broken on AARCH64
make -j8 o//third_party/radpajama/radpajama.com
make -j8 o//third_party/radpajama/radpajama-chat.com
This change gets the radpajama.mk config working. This package depends
on THIRD_PARTY_GGML but it's configured to call ggjt_v1(), so that the
library will provide the old quantizers. The ggml_quantize_chunk() API
will now dispatch to older quantizers based on the configured version.
This change makes quantized models (e.g. q4_0) go 10% faster on Macs
however doesn't offer much improvement for Intel PC hardware.
This change syncs llama.cpp 699b1ad7fe6f7b9e41d3cb41e61a8cc3ea5fc6b5
which recently made a breaking change to nearly all its file formats
without any migration. Since that'll break hundreds upon hundreds of
models on websites like HuggingFace llama.com will support both file
formats because llama.com will never ever break the GGJT file format
- Utilities like pledge.com now build
- kprintf() will no longer balk at 48-bit addresses
- There's a new aarch64-dbg build mode that should work
- gc() and defer() are mostly pacified; avoid using them on aarch64
- THIRD_PART_STB now has Arm Neon intrinsics for fast image handling
Example use case for JSON completion:
$ m=opt
$ make -j16 m=$m o/$m/third_party/ggml/llama.com
$ o/$m/third_party/ggml/llama.com -m llama.bin -p '{"key": "life", "val": ' -r '}'
42}
This provides better control. More sophisticated facilities for
controlling text generation will be provided soon enough.
We can once again create 2mb statically-linked Python binaries:
$ make -j8 m=tiny o/tiny/examples/pyapp/pyapp.com
$ ls -hal o/tiny/examples/pyapp/pyapp.com
-rwxr-xr-x 1 jart jart 2.1M May 1 14:04 o/tiny/examples/pyapp/pyapp.com
$ o/tiny/examples/pyapp/pyapp.com
cosmopolitan is cool!
The regression was caused by Python thread support in b15f9eb58
- Introduce -v and --verbose flags
- Don't print stats / diagnostics unless -v is passed
- Reduce --top_p default from 0.95 to 0.70
- Change --reverse-prompt to no longer imply --interactive
- Permit --reverse-prompt specifying custom EOS if non-interactive