Update README.md

John 2023-06-17 21:34:24 +02:00 committed by GitHub
parent f89c7592eb
commit cbb31807a3


@@ -1,7 +1,7 @@
llama.cpp modification to run Falcon (work in progress)
Status/Bugs:
-* Quantization works except for Q_K_ types
+* Quantization with QK_ types appears to fail on 7B models. (Q_ works on both; QK_ works on 40B)
* CUDA not yet functional
* Python conversion script is very basic (produces ggml v0)
* On Linux, a user reports a context-memory issue during batch token ingestion with a Q5_1 7B model; with -b 1 it's gone. Not reproduced on Windows