RMSE-optimized quants for all quantization types

This new option is ON by default. It can be turned off by defining LLAMA_NO_RMSE at build time. With the option enabled, Q4_3 quantization gives a perplexity of 6.0344, which is 0.0273 lower than with simple Q4_3 quantization.
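
To illustrate the general idea of choosing a quantization scale by minimizing the RMSE, here is a minimal sketch in C. It is not the code from this commit: the function names (`quant_sq_error`, `simple_scale`, `rmse_scale`), the signed 4-bit range [-8, 7], and the search grid of 17 candidate scales within ±16% of the simple scale are all assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sum of squared errors when each x[i] is rounded to
 * the nearest integer multiple of d, with the integer clamped to [-8, 7]. */
float quant_sq_error(const float *x, size_t n, float d) {
    assert(d >= 0.0f);
    float err = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        int q = 0;
        if (d > 0.0f) {
            float v = x[i] / d;
            q = (int)(v >= 0.0f ? v + 0.5f : v - 0.5f); /* round to nearest */
            if (q < -8) q = -8;
            if (q >  7) q =  7;
        }
        float diff = x[i] - (float)q * d;
        err += diff * diff;
    }
    return err;
}

/* "Simple" scale (assumed baseline): map the largest magnitude onto 7. */
float simple_scale(const float *x, size_t n) {
    float amax = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float a = x[i] < 0.0f ? -x[i] : x[i];
        if (a > amax) amax = a;
    }
    return amax / 7.0f;
}

/* RMSE-oriented scale: try a few candidates around the simple scale and keep
 * the one with the smallest squared error. Minimizing the sum of squares also
 * minimizes the RMSE, since n is fixed. The grid is an arbitrary choice. */
float rmse_scale(const float *x, size_t n) {
    float d0 = simple_scale(x, n);
    float best_d = d0;
    float best_err = quant_sq_error(x, n, d0);
    for (int step = -8; step <= 8; ++step) {
        float d = d0 * (1.0f + 0.02f * (float)step);
        float err = quant_sq_error(x, n, d);
        if (err < best_err) {
            best_err = err;
            best_d = d;
        }
    }
    return best_d;
}
```

Because the search starts from the simple scale and only accepts strictly better candidates, the result of `rmse_scale` is never worse than `simple_scale` in squared error on the same block; the cost is one extra error evaluation per candidate.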
2023-04-21 10:26:49 +02:00
commit e435bfd93c
parent 0e018fe008
3 changed files with 286 additions and 80 deletions
--- a/4
+++ b/4
@@ -134,6 +134,10 @@ ifneq ($(filter armv8%,$(UNAME_M)),)
 	CFLAGS += -mfp16-format=ieee -mno-unaligned-access
 endif

+ifdef LLAMA_NO_RMSE
+	CFLAGS += -DGGML_NO_RMSE
+endif
+
 #
 # Print build information
 #