server: tests: add a note regarding inference speed.

Pierrick HYMBERT 2024-02-23 22:25:39 +01:00
parent 71831494b1
commit 2d107babc4


@@ -1,6 +1,12 @@
-# Server Integration Test
-Server tests scenario using [BDD](https://en.wikipedia.org/wiki/Behavior-driven_development) with [behave](https://behave.readthedocs.io/en/latest/).
+# Server tests
+Python-based server tests scenario using [BDD](https://en.wikipedia.org/wiki/Behavior-driven_development) and [behave](https://behave.readthedocs.io/en/latest/).
+Tests target GitHub workflows job runners with 4 vCPU.
+Requests use an [aiohttp](https://docs.aiohttp.org/en/stable/client_reference.html)-based HTTP client built on [asyncio](https://docs.python.org/fr/3/library/asyncio.html).
+Note: If the host architecture's inference speed is faster than the GitHub runners', parallel scenarios may randomly fail. To mitigate this, increase the `n_predict` and `kv_size` values.
 ### Install dependencies
 `pip install -r requirements.txt`
@@ -14,7 +20,7 @@ Server tests scenario using [BDD](https://en.wikipedia.org/wiki/Behavior-driven_
 It's possible to override some scenario step values with environment variables:
 - `PORT` -> `context.server_port` to set the listening port of the server during scenario, default: `8080`
 - `LLAMA_SERVER_BIN_PATH` -> to change the server binary path, default: `../../../build/bin/server`
 - `DEBUG` -> "ON" to enable server verbose mode `--verbose`
 ### Run @bug, @wip or @wrong_usage annotated scenario
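The environment overrides listed in the diff can be sketched as follows. This is a minimal sketch: the variable names and defaults come from the README, while the `os.environ` lookups are an assumption about how the test harness might read them, not the actual step implementation.

```python
import os

# Read the overrides described in the README, falling back to the
# documented defaults when the environment variables are not set.
server_port = int(os.environ.get("PORT", "8080"))
server_bin_path = os.environ.get("LLAMA_SERVER_BIN_PATH", "../../../build/bin/server")

# DEBUG must be exactly "ON" to enable the server's `--verbose` mode.
debug = os.environ.get("DEBUG") == "ON"

print(server_port, server_bin_path, debug)
```

With these overrides, a run might look like `PORT=8081 DEBUG=ON behave`, assuming the suite is launched via `behave` as the README implies.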