Update README.md

John 2023-06-17 21:34:24 +02:00 committed by GitHub
parent f89c7592eb
commit cbb31807a3


@@ -1,7 +1,7 @@
llama.cpp modification to run Falcon (work in progress)
Status/Bugs:
-* Quantization works except for Q_K_ types
+* Quantization with QK_ types appears to fail on 7B models. (Q_ works on both; QK_ works on 40B)
* CUDA not yet functional
* Python conversion script is very basic (produces ggml v0)
* On Linux, a user reports a context-memory issue during batch token ingestion with a Q5_1 7B model; with -b 1 it's gone. Not reproduced on Windows