RMSE-optimized quants for all quantization types

This new option is ON by default. It can be turned off at
build time by setting LLAMA_NO_RMSE (for example,
make LLAMA_NO_RMSE=1), which adds -DGGML_NO_RMSE to CFLAGS.

With this option enabled, Q4_3 quantization results
in a perplexity of 6.0344, which is 0.0273 lower than
with simple Q4_3 quantization.
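
For readers unfamiliar with the idea, here is a minimal sketch of what
"RMSE-optimized" scale selection can look like for a generic symmetric
4-bit quantizer. It is not the code from this commit and does not follow
the Q4_3 block layout. The function names (sq_error, simple_scale,
rmse_scale), the [-8, 7] integer range, and the candidate grid of scales
are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: squared reconstruction error of x[0..n) when each
 * value is divided by scale d, rounded to the nearest integer, and clamped
 * to the assumed 4-bit range [-8, 7]. */
double sq_error(const float *x, size_t n, float d) {
    double err = 0.0;
    for (size_t i = 0; i < n; ++i) {
        int q = 0;
        if (d != 0.0f) {
            float v = x[i] / d;
            q = (int)(v + (v >= 0.0f ? 0.5f : -0.5f)); /* round half away from zero */
            if (q < -8) q = -8;
            if (q > 7) q = 7;
        }
        double diff = (double)x[i] - (double)d * (double)q;
        err += diff * diff;
    }
    return err;
}

/* "Simple" quantization: choose the scale so that max |x| maps to 7. */
float simple_scale(const float *x, size_t n) {
    float amax = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float a = x[i] < 0.0f ? -x[i] : x[i];
        if (a > amax) amax = a;
    }
    return amax / 7.0f;
}

/* RMSE-optimized variant (illustrative): try scales between 0.6x and 1.4x
 * the simple scale and keep the one with the lowest squared error. The
 * simple scale is the starting point, so the result is never worse. */
float rmse_scale(const float *x, size_t n) {
    assert(n > 0);
    float d0 = simple_scale(x, n);
    float best_d = d0;
    double best_err = sq_error(x, n, d0);
    for (int step = -8; step <= 8; ++step) {
        float d = d0 * (1.0f + 0.05f * (float)step);
        double err = sq_error(x, n, d);
        if (err < best_err) {
            best_err = err;
            best_d = d;
        }
    }
    return best_d;
}
```

Minimizing the squared error is equivalent to minimizing the RMSE for a
fixed block size. The search costs several extra passes over each block,
which is one plausible reason to offer a build-time switch to disable it.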
Iwan Kawrakow 2023-04-21 10:26:49 +02:00 committed by Georgi Gerganov
parent 0e018fe008
commit e435bfd93c
3 changed files with 286 additions and 80 deletions

@@ -134,6 +134,10 @@ ifneq ($(filter armv8%,$(UNAME_M)),)
CFLAGS += -mfp16-format=ieee -mno-unaligned-access
endif
ifdef LLAMA_NO_RMSE
CFLAGS += -DGGML_NO_RMSE
endif
#
# Print build information
#