Correct the type parameter given to ./quantize.

Giving `q4_0` as the type causes this error: `llama_model_quantize: failed to quantize: invalid output file type 0`.
Per the doc, the type should be 2 for q4_0:
  type = 2 - q4_0
  type = 3 - q4_1
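
For reference, this mapping comes from the usage text that ./quantize prints when invoked without arguments (sketched below; the exact wording may differ between versions):

  # invoking the tool with no arguments prints the supported types
  ./quantize
  # usage: ./quantize model-f16.bin model-quant.bin type
  #   type = 2 - q4_0
  #   type = 3 - q4_1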
Wen Shi 2023-04-28 23:58:30 +08:00 committed by GitHub
parent 11d902364b
commit ef551af6c1

@@ -271,7 +271,7 @@ python3 -m pip install -r requirements.txt
 python3 convert.py models/7B/
 # quantize the model to 4-bits (using q4_0 method)
-./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin q4_0
+./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin 2
 # run the inference
 ./main -m ./models/7B/ggml-model-q4_0.bin -n 128
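
As a usage note, the same numeric convention applies to the other format (a sketch based on the type list above; the q4_1 output filename is illustrative):

  # quantize to q4_1 instead (type = 3 per the usage text)
  ./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_1.bin 3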