build : add compile option to force use of MMQ kernels

2023-10-27 13:21:04 +03:00
commit 49af767fad
parent a4e15a36e4
3 changed files with 11 additions and 0 deletions
--- a/ggml-cuda.cu
+++ b/ggml-cuda.cu
@@ -92,6 +92,7 @@
 // for large computational tasks. the drawback is that this requires some extra amount of VRAM:
 // -  7B quantum model: +100-200 MB
 // - 13B quantum model: +200-400 MB
+//
 //#define GGML_CUDA_FORCE_MMQ

 // TODO: improve this to be correct for more hardware
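The hunk above only touches the comment next to the commented-out `GGML_CUDA_FORCE_MMQ` define, so the effect of the option is not visible here. As a rough sketch of how a compile-time switch like this typically gates kernel selection, the snippet below turns the macro into a constant and uses it to choose a path. `kForceMmq`, `MatMulPath`, and `select_path` are hypothetical names for illustration only and are not taken from `ggml-cuda.cu`. It also assumes that the alternative to MMQ is a tensor-core path, which this hunk does not show.

```cpp
// Illustrative sketch only: not the actual logic in ggml-cuda.cu.
// Enable by uncommenting the define or by passing -DGGML_CUDA_FORCE_MMQ
// to the compiler.
//#define GGML_CUDA_FORCE_MMQ

#ifdef GGML_CUDA_FORCE_MMQ
constexpr bool kForceMmq = true;   // hypothetical constant derived from the macro
#else
constexpr bool kForceMmq = false;
#endif

// Hypothetical set of matrix-multiplication paths.
enum class MatMulPath { MMQ, TensorCore };

// Pick the MMQ kernels when they are forced at compile time or when the
// device has no tensor cores; otherwise use the (assumed) tensor-core path.
constexpr MatMulPath select_path(bool has_tensor_cores) {
    return (kForceMmq || !has_tensor_cores) ? MatMulPath::MMQ
                                            : MatMulPath::TensorCore;
}
```

With the define left commented out, `select_path(true)` yields `MatMulPath::TensorCore`. Building with `-DGGML_CUDA_FORCE_MMQ` makes every call return `MatMulPath::MMQ`, which in this sketch would avoid the extra VRAM cost mentioned in the comment.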