FILES

  common.cc
  common.h
  companionai.txt
  fp16.c
  fp16.h
  fp16.internal.h
  ggjt.v1.c
  ggjt.v1.internal.h
  ggjt.v1.q4_0.c
  ggjt.v1.q4_0.h
  ggjt.v1.q4_1.c
  ggjt.v1.q4_1.h
  ggjt.v1.q4_2.c
  ggjt.v1.q4_2.h
  ggjt.v1.q5_0.c
  ggjt.v1.q5_0.h
  ggjt.v1.q5_1.c
  ggjt.v1.q5_1.h
  ggjt.v1.q8_0.c
  ggjt.v1.q8_0.h
  ggjt.v1.q8_1.c
  ggjt.v1.q8_1.h
  ggml.c
  ggml.h
  ggml.mk
  LICENSE
  llama.cc
  llama.h
  llama_util.h
  main.cc
  perplexity.cc
  quantize.cc
  README.cosmo
DESCRIPTION

  ggml is a machine learning library useful for LLM inference on CPUs

LICENSE

  MIT

ORIGIN

  https://github.com/ggerganov/llama.cpp
  commit 0b2da20538d01926b77ea237dd1c930c4d20b686
  Author: Stephan Walter <stephan@walter.name>
  Date:   Wed Apr 26 20:26:42 2023 +0000

      ggml : slightly faster AVX2 implementation for Q5 (#1197)

LOCAL CHANGES

  - Make it possible for loaded prompts to be cached to disk
  - Introduce -v and --verbose flags
  - Reduce batch size from 512 to 32
  - Allow --n_keep to specify a substring of the prompt
  - Don't print stats / diagnostics unless -v is passed
  - Reduce --top_p default from 0.95 to 0.70
  - Change --reverse-prompt so it no longer implies --interactive
  - Permit --reverse-prompt to specify a custom EOS if non-interactive
    (see the usage example below)
  - Refactor headers per cosmo convention
  - Replace code like 'ggjt' with READ32BE("ggjt") (see the first
    sketch below)
  - Remove C++ exceptions; use a Die() function instead (see the
    second sketch below)
  - Remove division from matrix multiplication
  - Let the quantizer convert between ggjt formats
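A note on the READ32BE() item above: upstream checks file magics with
multi-character constants like 'ggjt', whose value is
implementation-defined in C. Here is a minimal sketch of the
replacement pattern; the header path and the IsGgjtMagic() helper are
illustrative assumptions, not names taken from the tree:

    #include <stdbool.h>
    #include "libc/intrin/bits.h"  /* READ32BE(); path is an assumption */

    /* Loads four bytes big-endian, so the magic comparison is
       well-defined on any host, unlike the constant 'ggjt'. */
    static bool IsGgjtMagic(const unsigned char *p) {
      return READ32BE(p) == READ32BE("ggjt");
    }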
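On the exceptions item: cosmo builds this code without C++ exception
support, so error paths that upstream expressed as throws become calls
to a fatal-error helper. A minimal sketch of that pattern, assuming a
printf-style Die(); the real helper in llama.cc may be shaped
differently:

    #include <stdarg.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Prints a formatted message to stderr and exits, standing in
       for code like: throw std::runtime_error("failed to mmap"); */
    static void Die(const char *fmt, ...) {
      va_list va;
      va_start(va, fmt);
      vfprintf(stderr, fmt, va);
      va_end(va);
      fputc('\n', stderr);
      exit(1);
    }

    /* usage: Die("failed to open %s", path); */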
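And a usage example for the --reverse-prompt change: in
non-interactive mode the flag now acts as a custom end-of-sequence
marker rather than forcing interactive mode, so an invocation like the
following (binary name, model path, and prompt are placeholders) stops
generating as soon as the model emits "Q:":

    ./llama.com -m ggml-model-q4_0.bin \
        -p "Q: Name a primary color. A:" \
        --reverse-prompt "Q:"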