cosmopolitan/third_party/ggml/README.cosmo

DESCRIPTION

  ggml is a machine learning library useful for LLM inference on CPUs

LICENSE

  MIT

ORIGIN

  https://github.com/ggerganov/llama.cpp
  d8bd0013e8768aaa3dc9cfc1ff01499419d5348e
LOCAL CHANGES

  - Maintain support for deprecated file formats
  - Make it possible for loaded prompts to be cached to disk
  - Introduce -v and --verbose flags
  - Reduce batch size from 512 to 32
  - Allow --n_keep to specify a substring of prompt
  - Don't print stats / diagnostics unless -v is passed
  - Reduce --top_p default from 0.95 to 0.70
  - Change --reverse-prompt to no longer imply --interactive
  - Permit --reverse-prompt specifying custom EOS if non-interactive
    (sketched below)
  - Refactor headers per cosmo convention
  - Remove C++ exceptions; use Die() function instead (sketched below)
  - Remove division from matrix multiplication (sketched below)
  - Let quantizer convert between ggjt formats
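
  When llama.com is not run with --interactive, the --reverse-prompt
  string now behaves like a custom end-of-sequence marker: generation
  stops once the output ends with it. The following is a minimal
  sketch of that idea only; the helper names and the stand-in token
  stream are hypothetical, not code from llama.com.

    #include <iostream>
    #include <string>

    // Hypothetical sketch: treat --reverse-prompt as a custom EOS
    // marker when not in --interactive mode. The pieces array stands
    // in for the real sampling loop.
    static bool EndsWith(const std::string &s, const std::string &suf) {
      return s.size() >= suf.size() &&
             s.compare(s.size() - suf.size(), suf.size(), suf) == 0;
    }

    int main() {
      const std::string reverse_prompt = "User:";  // --reverse-prompt
      const char *pieces[] = {"The", " answer", " is", " 42.", "\n",
                              "User:"};
      std::string out;
      for (const char *piece : pieces) {
        out += piece;
        if (EndsWith(out, reverse_prompt)) {
          out.resize(out.size() - reverse_prompt.size());
          break;  // stop generating instead of waiting for user input
        }
      }
      std::cout << out;
    }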
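
  The Die() function mentioned above replaces thrown C++ exceptions
  with a fail-fast abort. Its exact signature in this tree may differ;
  the sketch below assumes a printf-style variadic helper and a
  hypothetical call site.

    #include <cstdarg>
    #include <cstdio>
    #include <cstdlib>

    // Assumed shape of the Die() helper: print a formatted message to
    // stderr and exit, rather than unwinding through a C++ exception.
    [[noreturn]] static void Die(const char *fmt, ...) {
      va_list va;
      va_start(va, fmt);
      vfprintf(stderr, fmt, va);
      va_end(va);
      fputc('\n', stderr);
      exit(1);
    }

    int main() {
      FILE *f = fopen("weights.bin", "rb");  // hypothetical path
      // was: throw std::runtime_error("failed to open weights.bin");
      if (!f) Die("failed to open %s", "weights.bin");
      fclose(f);
    }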
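
  The note about division does not say which division was removed, so
  the sketch below only illustrates the general technique: restructure
  the index arithmetic so the matmul hot loop executes no integer
  division.

    #include <cstdio>
    #include <vector>

    int main() {
      const int nrows = 4, ncols = 4, inner = 4;
      std::vector<float> A(nrows * inner, 1.0f);
      std::vector<float> B(inner * ncols, 2.0f);
      std::vector<float> C(nrows * ncols, 0.0f);

      // A single flat loop would recover indices with a division and
      // a modulo on every element:  r = i / ncols; c = i % ncols;
      // Nested loops carry the indices directly, so the hot path
      // contains no div instruction.
      for (int r = 0; r < nrows; ++r)
        for (int c = 0; c < ncols; ++c) {
          float acc = 0.0f;
          for (int k = 0; k < inner; ++k)
            acc += A[r * inner + k] * B[k * ncols + c];
          C[r * ncols + c] = acc;
        }

      printf("C[0][0] = %g\n", C[0]);  // 4 terms of 1*2 = 8
    }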