cosmopolitan

mirror of https://github.com/jart/cosmopolitan.git synced 2025-01-31 03:27:39 +00:00

Author	SHA1	Message	Date
Justine Tunney	7512318a2a	Fix MODE=aarch64 build	2023-06-08 05:17:37 -07:00
Justine Tunney	daf4454a06	Validate privileged code relationships - Work towards improving non-optimized build support - Introduce MODE=zero which is -O0 without ASAN/UBSAN - Use system GCC when ~/.cosmo.mk has USE_SYSTEM_TOOLCHAIN=1 - Have package.com check .privileged code doesn't call non-privileged	2023-06-08 04:38:06 -07:00
Justine Tunney	b6182db813	Simplify ftrace_hook() We now have a test to prove that its transitive closure doesn't perform floating point computations.	2023-06-06 11:10:38 -07:00
Justine Tunney	61b9677c05	Make improvements - Get mprotect_test working on aarch64 - Get completion working on python.com repl again - Improve quality of printvideo.com and printimage.com - Fix bug in openpty() so examples/script.c works again	2023-06-06 09:12:30 -07:00
Justine Tunney	b94b29d79c	Prevent ftrace from misaligning functions	2023-06-06 06:00:31 -07:00
Justine Tunney	b8a6a989c0	Create ELF aliases for identical symbols This change greatly reduces the number of modules that need to be compiled. The only issue right now is that sometimes when viewing symbol table entries, the aliased symbol is chosen.	2023-06-06 03:33:49 -07:00
Justine Tunney	eb40cb371d	Get --ftrace working on aarch64 This change implements a new approach to function call logging, that's based on the GCC flag: -fpatchable-function-entry. Read the commentary in build/config.mk to learn how it works.	2023-06-05 23:35:31 -07:00
Justine Tunney	5b908bc756	Fix some build errors	2023-06-05 15:53:44 -07:00
Justine Tunney	9cc3e37263	Upgrade to Cosmopolitan GCC 11.2.0 for aarch64	2023-06-05 02:07:28 -07:00
Justine Tunney	39f20dbb13	Upgrade to Cosmopolitan GCC 11.2.0 for x86_64	2023-06-05 02:06:18 -07:00
Justine Tunney	fc34ba2596	Fix Linenoise REPL on AARCH64	2023-06-04 02:57:17 -07:00
Justine Tunney	bcf9af94bf	Get threads working well on MacOS Arm64 - Now using 10x better GCD semaphores - We now generate Linux-like thread ids - We now use fast system clock / sleep libraries - The APE M1 loader now generates Linux-like stacks	2023-06-04 01:57:10 -07:00
Justine Tunney	b5eab2b0b7	Get POSIX threads working on Apple Silicon It's now possible to run a working ape-m1 o/aarch64/third_party/ggml/llama.com on Apple M1 hardware running XNU!	2023-06-03 18:33:01 -07:00
Justine Tunney	8fdb31681a	Introduce support for GGJT v3 file format llama.com can now load weights that use the new file format which was introduced a few weeks ago. Note that, unlike llama.cpp, we will keep support for old file formats in our tool so you don't need to convert your weights when the upstream project makes breaking changes. Please note that using ggjt v3 does make avx2 inference go 5% faster for me.	2023-06-03 15:46:21 -07:00
Justine Tunney	1904a3cae8	Sync llama.cpp to 6986c7835adc13ba3f9d933b95671bb1f3984dc6	2023-06-03 10:29:12 -07:00
Justine Tunney	8f522cb702	Make improvements This change progresses our AARCH64 support: - The AARCH64 build and tests are now passing - Add 128-bit floating-point support to printf() - Fix clone() so it initializes cosmo's x28 TLS register - Fix TLS memory layout issue with aarch64 _Alignas vars - Revamp microbenchmarking tools so they work on aarch64 - Make some subtle improvements to aarch64 crash reporting - Make kisdangerous() memory checks more accurate on aarch64 - Remove sys_open() since it's not available on Linux AARCH64 This change makes general improvements to Cosmo and Redbean: - Introduce GetHostIsa() function in Redbean - You can now feature check using pledge(0, 0) - You can now feature check using unveil("",0) - Refactor some more x86-specific asm comments - Refactor and write docs for some libm functions - Make the mmap() API behave more similar to Linux - Fix WIFSIGNALED() which wrongly returned true for zero - Rename some obscure cosmo keywords from noFOO to dontFOO	2023-06-03 08:12:22 -07:00
Justine Tunney	1422e96b4e	Introduce native support for MacOS ARM64 There's a new program named ape/ape-m1.c which will be used to build an embeddable binary that can load ape and elf executables. The support is mostly working so far, but still chasing down ABI issues.	2023-05-20 04:17:03 -07:00
Justine Tunney	e7eb0b3070	Make more ML improvements - Fix UX issues with llama.com - Do housekeeping on libm code - Add more vectorization to GGML - Get GGJT quantizer programs working well - Have the quantizer keep the output layer as f16c - Prefetching improves performance 15% if you use fewer threads	2023-05-16 08:07:23 -07:00
Justine Tunney	80db9de173	Make the intrinsics more readable	2023-05-15 23:12:11 -07:00
Justine Tunney	210187cf77	Perform some code cleanup	2023-05-15 16:32:10 -07:00
Justine Tunney	cc1732bc42	Make AARCH64 harder, better, faster, stronger - Perform some housekeeping on scalar math function code - Import ARM's Optimized Routines for SIMD string processing - Upgrade to latest Chromium zlib and enable more SIMD optimizations	2023-05-15 02:15:34 -07:00
Justine Tunney	550b52abf6	Port a lot more code to AARCH64 - Introduce epoll_pwait() - Rewrite -ftrapv and ffs() libraries in C code - Use more FreeBSD code in math function library - Get significantly more tests passing on qemu-aarch64 - Fix many Musl long double functions that were broken on AARCH64	2023-05-14 09:37:26 -07:00
Ariel Núñez	91791e9f38	Started removing features from RedPajama to make it easier to understand for beginners (#817)	2023-05-14 09:16:22 -07:00
Justine Tunney	89d1fad7ee	Enable crash reports for radpajama executables	2023-05-13 21:16:03 -07:00
Justine Tunney	296ee3ec58	Make some other fixes to radpajama build config	2023-05-13 21:09:28 -07:00
Justine Tunney	282dd8e7b7	Get radpajama to build make -j8 o//third_party/radpajama/radpajama.com make -j8 o//third_party/radpajama/radpajama-chat.com This change gets the radpajama.mk config working. This package depends on THIRD_PARTY_GGML but it's configured to call ggjt_v1(), so that the library will provide the old quantizers. The ggml_quantize_chunk() API will now dispatch to older quantizers based on the configured version.	2023-05-13 20:44:36 -07:00
Justine Tunney	410c8785c9	Fix the AARCH64 build	2023-05-13 08:19:44 -07:00
Justine Tunney	5a4cf9560f	Add support for new GGJT v2 quantizers This change makes quantized models (e.g. q4_0) go 10% faster on Macs; however, it doesn't offer much improvement for Intel PC hardware. This change syncs llama.cpp 699b1ad7fe6f7b9e41d3cb41e61a8cc3ea5fc6b5, which recently made a breaking change to nearly all its file formats without any migration. Since that'll break hundreds upon hundreds of models on websites like HuggingFace, llama.com will support both file formats, because llama.com will never ever break the GGJT file format.	2023-05-13 08:08:32 -07:00
Justine Tunney	802e7eb4ef	Mop up more test regressions	2023-05-13 01:09:44 -07:00
Justine Tunney	4a8a81eb9f	Fix llama.com interactive mode regressions	2023-05-13 00:09:38 -07:00
Justine Tunney	fd34ef732d	Make considerably more progress on AARCH64 - Utilities like pledge.com now build - kprintf() will no longer balk at 48-bit addresses - There's a new aarch64-dbg build mode that should work - gc() and defer() are mostly pacified; avoid using them on aarch64 - THIRD_PARTY_STB now has Arm Neon intrinsics for fast image handling	2023-05-12 22:42:57 -07:00
Justine Tunney	1bfb3aab1b	Make Arm Neon intrinsics work with `make tags`	2023-05-12 18:32:53 -07:00
Justine Tunney	45186c74ac	Introduce -q (quiet flag) and improve ctrl-c ux	2023-05-12 09:46:07 -07:00
Justine Tunney	e8de1e4766	Fix subtoken antiprompt scanning	2023-05-12 08:55:40 -07:00
Justine Tunney	80c174d494	Clean up llama.com anti/stop/reverse-prompt code Example use case for JSON completion: $ m=opt $ make -j16 m=$m o/$m/third_party/ggml/llama.com $ o/$m/third_party/ggml/llama.com -m llama.bin -p '{"key": "life", "val": ' -r '}' 42} This provides better control. More sophisticated facilities for controlling text generation will be provided soon enough.	2023-05-12 08:20:58 -07:00
Justine Tunney	bbfe4fbd11	Make llama.com n_predict be -1 by default	2023-05-12 08:20:34 -07:00
Justine Tunney	ca19ecf49c	Fine tune crash reports for llama.com	2023-05-12 06:24:26 -07:00
Justine Tunney	4edbc98811	Get MbedTLS and its unit tests passing AARCH64	2023-05-11 21:53:15 -07:00
Justine Tunney	5e2f7f7ced	Get LIBC_TESTLIB building on AARCH64	2023-05-11 19:57:09 -07:00
Justine Tunney	95fab334e4	Use yield on aarch in spin locks	2023-05-11 19:57:09 -07:00
Ariel Núñez	b3e3359d22	Import radpajama (a redpajama.cpp fork) (#814) This is the relevant commit: `bfa6466199` Model download links: https://huggingface.co/ceonlabs/radpajama/tree/main	2023-05-11 07:12:08 -07:00
Justine Tunney	1f6f9e6701	Remove division from matrix multiplication This change reduces llama.com CPU cycles systemically by 2.5% according to the Linux Kernel `perf stat -Bddd` utility.	2023-05-10 21:19:54 -07:00
Justine Tunney	a88290e595	Make sure llama.com terminal cleanup happens	2023-05-10 15:56:01 -07:00
Justine Tunney	5250feb7ad	There must only be one strerror()	2023-05-10 15:34:13 -07:00
Justine Tunney	bb3ebedfce	Fix load time measurement	2023-05-10 07:54:21 -07:00
Justine Tunney	290a49952e	Fix some more issues with aarch64 and llama.cpp	2023-05-10 07:34:26 -07:00
Justine Tunney	12a33858c9	There must be only one clock()	2023-05-10 06:16:01 -07:00
Justine Tunney	6cb9553706	Fix alignment bug in llama.com	2023-05-10 06:15:32 -07:00
Justine Tunney	ca990ef091	Make `llama.com -h` print to stdout	2023-05-10 04:55:59 -07:00
Justine Tunney	5f57fc1f59	Upgrade llama.cpp to e6a46b0ed1884c77267dc70693183e3b7164e0e0	2023-05-10 04:20:48 -07:00


606 commits