ggml : add Q8_0 quantization for intermediate results (#951)
* ggml : add Q8_0 quantization for intermediate results
* quantize-stats : fix test + add it to Makefile default
* Q8: use int8_t, AVX/AVX2 optimizations
* ggml : fix quantize_row_q8_0() ARM_NEON rounding
* minor : updates after rebase to latest master
* quantize-stats : delete obsolete strings
* ggml : fix q4_1 dot func

---------

Co-authored-by: Stephan Walter <stephan@walter.name>
parent aa485cee33
commit e95b6554b4
3 changed files with 442 additions and 18 deletions
Makefile: 2 changed lines

@@ -133,7 +133,7 @@ $(info I CC: $(CCV))
 $(info I CXX: $(CXXV))
 $(info )
 
-default: main quantize perplexity embedding
+default: main quantize quantize-stats perplexity embedding
 
 #
 # Build library