Correct the quantization type parameter.
Passing `q4_0` as the type causes this error: `llama_model_quantize: failed to quantize: invalid output file type 0`. Per the doc, the type should be given numerically, so 2 is needed for q4_0:

type = 2 - q4_0
type = 3 - q4_1
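A quick before/after sketch of the behaviour described above (paths and error line taken from the README and the commit description):

```sh
# fails: quantize does not accept the type name here
./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin q4_0
# llama_model_quantize: failed to quantize: invalid output file type 0

# works: 2 selects q4_0 (3 would select q4_1)
./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin 2
```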
parent 11d902364b
commit ef551af6c1
1 changed file with 1 addition and 1 deletion
```diff
@@ -271,7 +271,7 @@ python3 -m pip install -r requirements.txt
 python3 convert.py models/7B/
 
 # quantize the model to 4-bits (using q4_0 method)
-./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin q4_0
+./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin 2
 
 # run the inference
 ./main -m ./models/7B/ggml-model-q4_0.bin -n 128
```