server: bench: minor fixes (#10765)
* server/bench: - support openAI streaming standard output with [DONE]\n\n - export k6 raw results in csv - fix too many tcp idle connection in tcp_wait - add metric time to emit first token * server/bench: - fix when prometheus not started - wait for server to be ready before starting bench
This commit is contained in:
parent
0da5d86026
commit
2f0ee84b9b
3 changed files with 39 additions and 15 deletions
|
@ -6,10 +6,10 @@ Benchmark is using [k6](https://k6.io/).
|
|||
|
||||
SSE is not supported by default in k6, you have to build k6 with the [xk6-sse](https://github.com/phymbert/xk6-sse) extension.
|
||||
|
||||
Example:
|
||||
Example (assuming golang >= 1.21 is installed):
|
||||
```shell
|
||||
go install go.k6.io/xk6/cmd/xk6@latest
|
||||
xk6 build master \
|
||||
$GOPATH/bin/xk6 build master \
|
||||
--with github.com/phymbert/xk6-sse
|
||||
```
|
||||
|
||||
|
@ -33,7 +33,7 @@ The server must answer OAI Chat completion requests on `http://localhost:8080/v1
|
|||
|
||||
Example:
|
||||
```shell
|
||||
server --host localhost --port 8080 \
|
||||
llama-server --host localhost --port 8080 \
|
||||
--model ggml-model-q4_0.gguf \
|
||||
--cont-batching \
|
||||
--metrics \
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue