cosmopolitan/third_party/ggml
Justine Tunney 1f6f9e6701
Remove division from matrix multiplication
This change reduces llama.com CPU cycles systemically by 2.5% according
to the Linux Kernel `perf stat -Bddd` utility.
2023-05-10 21:19:54 -07:00
Name             Last commit                                                    Date
common.cc        Make llama.com -h print to stdout                              2023-05-10 04:55:59 -07:00
common.h         Fix some more issues with aarch64 and llama.cpp                2023-05-10 07:34:26 -07:00
companionai.txt  Upgrade llama.cpp to e6a46b0ed1884c77267dc70693183e3b7164e0e0  2023-05-10 04:20:48 -07:00
ggml.c           Remove division from matrix multiplication                     2023-05-10 21:19:54 -07:00
ggml.h           Upgrade llama.cpp to e6a46b0ed1884c77267dc70693183e3b7164e0e0  2023-05-10 04:20:48 -07:00
ggml.mk          Remove division from matrix multiplication                     2023-05-10 21:19:54 -07:00
LICENSE          Import llama.cpp                                               2023-04-27 14:37:14 -07:00
llama.cc         Fix load time measurement                                      2023-05-10 07:54:21 -07:00
llama.h          Upgrade llama.cpp to e6a46b0ed1884c77267dc70693183e3b7164e0e0  2023-05-10 04:20:48 -07:00
llama_util.h     Fix alignment bug in llama.com                                 2023-05-10 06:15:32 -07:00
main.cc          Make sure llama.com terminal cleanup happens                   2023-05-10 15:56:01 -07:00
README.cosmo     Use Companion AI in llama.com by default                       2023-04-30 23:08:15 -07:00

DESCRIPTION

  ggml is a machine learning library useful for LLM inference on CPUs

LICENSE

  MIT

ORIGIN

  https://github.com/ggerganov/llama.cpp
  commit 0b2da20538d01926b77ea237dd1c930c4d20b686
  Author: Stephan Walter <stephan@walter.name>
  Date:   Wed Apr 26 20:26:42 2023 +0000
  ggml : slightly faster AVX2 implementation for Q5 (#1197)

LOCAL CHANGES

  - Make it possible for loaded prompts to be cached to disk
  - Introduce -v and --verbose flags
  - Reduce batch size from 512 to 32
  - Allow --n_keep to specify a substring of prompt
  - Don't print stats / diagnostics unless -v is passed
  - Reduce --top_p default from 0.95 to 0.70
  - Change --reverse-prompt to no longer imply --interactive
  - Permit --reverse-prompt specifying custom EOS if non-interactive
  - Refactor headers per cosmo convention
  - Replace code like 'ggjt' with READ32BE("ggjt")
  - Remove C++ exceptions; use Die() function instead
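
  The READ32BE change above can be illustrated with a small sketch. A
  multi-character constant such as 'ggjt' has an implementation-defined
  value in C and C++, whereas packing the four bytes of a string in
  big-endian order gives the same number on every compiler. The macro
  below is a hypothetical stand-in written for this example, and
  is_ggjt_magic() is likewise an invented helper; neither is copied from
  the tree, and the real READ32BE definition may differ.

```c
#include <stdint.h>

/* Hypothetical stand-in for READ32BE (an assumption, not the tree's
   definition): packs the first four bytes at p into a 32-bit value,
   most significant byte first. */
#define READ32BE_SKETCH(p)                                            \
  ((uint32_t)(255 & (p)[0]) << 24 | (uint32_t)(255 & (p)[1]) << 16 |  \
   (uint32_t)(255 & (p)[2]) << 8 | (uint32_t)(255 & (p)[3]))

/* Hypothetical helper: returns 1 if a magic number that was already
   decoded from a file header matches the bytes "ggjt", else 0. */
static int is_ggjt_magic(uint32_t magic) {
  return magic == READ32BE_SKETCH("ggjt");
}
```

  With this sketch, READ32BE_SKETCH("ggjt") evaluates to 0x67676a74, the
  bytes 'g', 'g', 'j', 't' read from most to least significant, so the
  comparison no longer depends on how a compiler encodes 'ggjt'.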