fix(doc): update container tag from server to server-cuda for README example on running server container with CUDA

2024-01-27 12:03:29 -06:00 · 734cf1096b
commit 734cf1096b
parent 7298e97947
1 changed file with 1 addition and 1 deletion
--- a/examples/server/README.md
+++ b/examples/server/README.md
@@ -70,7 +70,7 @@ You can consume the endpoints with Postman or NodeJS with axios library. You can
 docker run -p 8080:8080 -v /path/to/models:/models ggerganov/llama.cpp:server -m models/7B/ggml-model.gguf -c 512 --host 0.0.0.0 --port 8080

 # or, with CUDA:
-docker run -p 8080:8080 -v /path/to/models:/models --gpus all ggerganov/llama.cpp:server -m models/7B/ggml-model.gguf -c 512 --host 0.0.0.0 --port 8080 --n-gpu-layers 99
+docker run -p 8080:8080 -v /path/to/models:/models --gpus all ggerganov/llama.cpp:server-cuda -m models/7B/ggml-model.gguf -c 512 --host 0.0.0.0 --port 8080 --n-gpu-layers 99
 ```

 ## Testing with CURL