readme: add missing info

- new integer quantization types
- AVX* support
- *BLAS support
Pavol Rusnak · 2023-05-04 20:59:19 +02:00
commit 47bbd631f2 (parent 34d9f22f44)


@@ -18,10 +18,12 @@ The main goal of `llama.cpp` is to run the LLaMA model using 4-bit integer quant
 - Plain C/C++ implementation without dependencies
 - Apple silicon first-class citizen - optimized via ARM NEON and Accelerate framework
-- AVX2 support for x86 architectures
+- AVX, AVX2 and AVX512 support for x86 architectures
 - Mixed F16 / F32 precision
-- 4-bit integer quantization support
+- 4-bit, 5-bit and 8-bit integer quantization support
 - Runs on the CPU
+- OpenBLAS support
+- cuBLAS and CLBlast support
 
 The original implementation of `llama.cpp` was [hacked in an evening](https://github.com/ggerganov/llama.cpp/issues/33#issuecomment-1465108022).
 Since then, the project has improved significantly thanks to many contributions. This project is for educational purposes and serves
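
To give a feel for what the new "4-bit, 5-bit and 8-bit integer quantization" entry refers to, here is a minimal C sketch of block-wise 4-bit quantization: floats are grouped into fixed-size blocks, each block stores one scale plus packed 4-bit codes. This is an illustrative simplification, not ggml's actual Q4_0/Q5_0/Q8_0 wire formats; the block size, scale derivation, and packing below are assumptions chosen for clarity.

```c
// Illustrative block-wise 4-bit quantization sketch (simplified; not ggml's
// exact Q4/Q5/Q8 formats). Compile with: cc q4_sketch.c -lm
#include <math.h>
#include <stdint.h>
#include <stdio.h>

#define BLOCK 32  // quantize 32 floats at a time, one scale per block (assumed size)

typedef struct {
    float   d;             // per-block scale; dequantize as x ~= d * (q - 8)
    uint8_t q[BLOCK / 2];  // 32 x 4-bit codes packed two per byte
} block_q4;

static void quantize_block(const float *x, block_q4 *out) {
    // find the largest magnitude in the block
    float amax = 0.0f;
    for (int i = 0; i < BLOCK; i++) {
        if (fabsf(x[i]) > amax) amax = fabsf(x[i]);
    }
    // map [-amax, amax] onto 4-bit codes centered at 8
    out->d = amax / 7.0f;
    const float id = out->d != 0.0f ? 1.0f / out->d : 0.0f;
    for (int i = 0; i < BLOCK; i += 2) {
        int q0 = (int)roundf(x[i]     * id) + 8;
        int q1 = (int)roundf(x[i + 1] * id) + 8;
        if (q0 < 0) q0 = 0; if (q0 > 15) q0 = 15;
        if (q1 < 0) q1 = 0; if (q1 > 15) q1 = 15;
        out->q[i / 2] = (uint8_t)(q0 | (q1 << 4));  // two nibbles per byte
    }
}

int main(void) {
    float x[BLOCK];
    for (int i = 0; i < BLOCK; i++) x[i] = sinf((float)i);  // sample data
    block_q4 b;
    quantize_block(x, &b);
    // dequantize the first value to show the round trip
    float x0 = b.d * ((b.q[0] & 0x0F) - 8);
    printf("x[0] = %f, dequantized = %f\n", x[0], x0);
    return 0;
}
```

The 5-bit and 8-bit variants follow the same per-block-scale idea with more code levels (8-bit simply stores one byte per value), trading a little extra memory for lower rounding error.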