Allow "quantizing" to f16 and f32 (#1787)

* Allow "quantizing" to f16 and f32

Fix an issue where quantizing didn't respect LLAMA_NO_K_QUANTS

Add brief help to the list of quantization types in the quantize tool

Ignore case for quantization type arguments in the quantize tool
This commit is contained in:
Kerfuffle 2023-06-13 04:23:23 -06:00 committed by GitHub
parent 74a6d922f1
commit 74d4cfa343
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
4 changed files with 154 additions and 48 deletions

View file

@ -127,6 +127,7 @@ endif
ifndef LLAMA_NO_K_QUANTS
CFLAGS += -DGGML_USE_K_QUANTS
CXXFLAGS += -DGGML_USE_K_QUANTS
OBJS += k_quants.o
endif