RMSE-optimized quants for all quantization types
This new option is ON by default; it can be turned off by setting LLAMA_NO_RMSE when building. With the option enabled, Q4_3 quantization yields a perplexity of 6.0344, which is 0.0273 lower than with simple Q4_3 quantization.
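To make the idea concrete, here is a minimal sketch of what an RMSE-optimized scale choice for one block of signed 4-bit quants could look like, and how the GGML_NO_RMSE define (added to CFLAGS when LLAMA_NO_RMSE is set, e.g. via `make LLAMA_NO_RMSE=1`) might switch it off. This is not the code from this commit: the function names (quantize_level, block_sq_error, simple_scale, choose_scale), the [-8, 7] level range, and the search grid of +/-16% in 2% steps are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round x * inv_scale to a signed 4-bit level in [-8, 7]. */
int quantize_level(float x, float inv_scale) {
    float v = x * inv_scale;
    int q = (int)(v >= 0.0f ? v + 0.5f : v - 0.5f); /* round to nearest */
    if (q < -8) q = -8;
    if (q >  7) q =  7;
    return q;
}

/* Sum of squared reconstruction errors of a block for a given scale.
 * Minimizing this sum is equivalent to minimizing the block's RMSE. */
float block_sq_error(const float *x, size_t n, float scale) {
    float inv = scale != 0.0f ? 1.0f / scale : 0.0f;
    float err = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float d = x[i] - scale * (float)quantize_level(x[i], inv);
        err += d * d;
    }
    return err;
}

/* "Simple" scale: map the largest magnitude in the block onto level 7. */
float simple_scale(const float *x, size_t n) {
    float amax = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float a = x[i] < 0.0f ? -x[i] : x[i];
        if (a > amax) amax = a;
    }
    return amax / 7.0f;
}

/* Pick a scale for the block. Unless GGML_NO_RMSE is defined, try a few
 * candidates around the simple scale and keep the one with the lowest error. */
float choose_scale(const float *x, size_t n) {
    assert(n > 0);
    float best = simple_scale(x, n);
#ifndef GGML_NO_RMSE
    const float base = best;
    float best_err = block_sq_error(x, n, base);
    for (int step = -8; step <= 8; ++step) {
        /* Assumed search grid: base scale adjusted by -16% .. +16%. */
        float cand = base * (1.0f + 0.02f * (float)step);
        float err = block_sq_error(x, n, cand);
        if (err < best_err) {
            best_err = err;
            best = cand;
        }
    }
#endif
    return best;
}
```

Because the unmodified simple scale is always one of the candidates, the searched scale can never reconstruct the block worse than the simple one; the cost is extra work at quantization time, which is presumably why an opt-out is offered.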
parent 0e018fe008
commit e435bfd93c
3 changed files with 286 additions and 80 deletions
Makefile | 4 ++++
@@ -134,6 +134,10 @@ ifneq ($(filter armv8%,$(UNAME_M)),)
 	CFLAGS += -mfp16-format=ieee -mno-unaligned-access
 endif
 
+ifdef LLAMA_NO_RMSE
+	CFLAGS += -DGGML_NO_RMSE
+endif
+
 #
 # Print build information
 #