docs : Quantum -> Quantized (#8666)
Some checks failed
flake8 Lint / Lint (push) Has been cancelled
* docfix: imatrix readme, quantum models -> quantized models.
* docfix: server readme: quantum models -> quantized models.
This commit is contained in:
parent 8a4bad50a8
commit 4b0eff3df5

2 changed files with 2 additions and 2 deletions
```diff
@@ -5,7 +5,7 @@ Fast, lightweight, pure C/C++ HTTP server based on [httplib](https://github.com/
 Set of LLM REST APIs and a simple web front end to interact with llama.cpp.
 
 **Features:**
-* LLM inference of F16 and quantum models on GPU and CPU
+* LLM inference of F16 and quantized models on GPU and CPU
 * [OpenAI API](https://github.com/openai/openai-openapi) compatible chat completions and embeddings routes
 * Parallel decoding with multi-user support
 * Continuous batching
```
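The feature list touched by this hunk mentions the server's OpenAI-compatible chat completions route. As a minimal sketch of what that compatibility means in practice, the snippet below sends a chat request to a locally running server using only the Python standard library. The host/port (`localhost:8080`) and the placeholder model name are assumptions, not taken from this commit; adjust them to match how the server was started.

```python
import json
import urllib.request

# Assumption: llama.cpp's server is already running and listening on
# localhost:8080 (adjust to the host/port you actually launched it with).
URL = "http://localhost:8080/v1/chat/completions"

payload = {
    # The server serves the single model it was started with; this name
    # is a placeholder, not a value defined anywhere in this commit.
    "model": "local-model",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)
    # The response follows the OpenAI schema: choices[0].message.content
    print(body["choices"][0]["message"]["content"])
```

Because the route follows the OpenAI schema, existing OpenAI client libraries can also be pointed at the server by overriding their base URL.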