DESCRIPTION

  ggml is a machine learning library useful for LLM inference on CPUs

LICENSE

  MIT

ORIGIN

  https://github.com/ggerganov/llama.cpp
  commit 0b2da20538d01926b77ea237dd1c930c4d20b686
  Author: Stephan Walter <stephan@walter.name>
  Date: Wed Apr 26 20:26:42 2023 +0000

      ggml : slightly faster AVX2 implementation for Q5 (#1197)

LOCAL CHANGES

  - Make it possible for loaded prompts to be cached to disk
  - Introduce -v and --verbose flags
  - Reduce batch size from 512 to 32
  - Allow --n_keep to specify a substring of the prompt
  - Don't print stats / diagnostics unless -v is passed
  - Reduce --top_p default from 0.95 to 0.70
  - Change --reverse-prompt to no longer imply --interactive
  - Permit --reverse-prompt to specify a custom EOS if non-interactive
  - Refactor headers per cosmo convention
  - Replace code like 'ggjt' with READ32BE("ggjt") (see sketch below)
  - Remove C++ exceptions; use Die() function instead
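
  Below is a minimal, self-contained sketch of the last two idioms. It
  is illustrative rather than this tree's actual code: READ32BE is
  written out inline with the same semantics as cosmopolitan's macro of
  that name, and the real Die()'s signature may differ.

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Endian-defined big-endian load with the same semantics as
       cosmopolitan's READ32BE; unlike the multi-character constant
       'ggjt', its value is not implementation-defined. */
    #define READ32BE(s)                 \
      ((uint32_t)(255 & (s)[0]) << 24 | \
       (uint32_t)(255 & (s)[1]) << 16 | \
       (uint32_t)(255 & (s)[2]) << 8 |  \
       (uint32_t)(255 & (s)[3]))

    /* Stand-in for the Die() that replaced C++ exceptions; prints an
       error and exits rather than unwinding the stack. */
    static void Die(const char *msg) {
      fprintf(stderr, "%s\n", msg);
      exit(1);
    }

    static void CheckMagic(uint32_t magic) {
      if (magic != READ32BE("ggjt")) {  /* was: if (magic != 'ggjt') */
        Die("not a ggjt format file");
      }
    }

    int main(void) {
      CheckMagic(0x67676a74);  /* bytes "ggjt" read big-endian: passes */
      puts("ok");
      return 0;
    }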