chore: add references to the quantisation space.

parent 9f773486ab
commit 3775d0debb

2 changed files with 7 additions and 1 deletion
@@ -712,6 +712,10 @@ Building the program with BLAS support may lead to some performance improvements

 ### Prepare and Quantize

+Note: You can use the [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space on Hugging Face to quantise your model weights without any setup too.
+
+It is synced to `llama.cpp` main every 6 hours.
+
 To obtain the official LLaMA 2 weights please see the <a href="#obtaining-and-using-the-facebook-llama-2-model">Obtaining and using the Facebook LLaMA 2 model</a> section. There is also a large selection of pre-quantized `gguf` models available on Hugging Face.

 Note: `convert.py` does not support LLaMA 3, you can use `convert-hf-to-gguf.py` with LLaMA 3 downloaded from Hugging Face.
@@ -1,6 +1,8 @@
 # quantize

-TODO
+You can also use the [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space on Hugging Face to build your own quants without any setup.
+
+Note: It is synced to llama.cpp `main` every 6 hours.

 ## Llama 2 7B
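For context, the local workflow that the GGUF-my-repo space automates looks roughly like the sketch below. The model directory, output filenames, and the `q4_k_m` quant type are illustrative placeholders, not part of this commit; `convert-hf-to-gguf.py` and the `quantize` tool are the scripts referenced in the README text above.

```shell
# Sketch of the local quantisation workflow that GGUF-my-repo replaces.
# Paths and the quant type (q4_k_m) are illustrative placeholders.

# 1. Convert Hugging Face weights to a gguf file. As noted above, use
#    convert-hf-to-gguf.py for LLaMA 3 (convert.py does not support it).
python3 convert-hf-to-gguf.py ./models/llama-3-8b \
    --outfile ./models/llama-3-8b-f16.gguf

# 2. Quantise the f16 gguf down to a smaller format.
./quantize ./models/llama-3-8b-f16.gguf ./models/llama-3-8b-q4_k_m.gguf q4_k_m
```

The space runs the same conversion and quantisation steps server-side against a Hugging Face repo, which is why it needs no local setup.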