server: Add "tokens per second" information in the backend (#10548)
* add cmake rvv support * add timings * remove space * update readme * fix * fix code * remove empty line * add test --------- Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
This commit is contained in:
parent
991f8aabee
commit
64ed2091b2
5 changed files with 44 additions and 1 deletions
|
@ -416,6 +416,8 @@ node index.js
|
|||
|
||||
`samplers`: The order the samplers should be applied in. An array of strings representing sampler type names. If a sampler is not set, it will not be used. If a sampler is specified more than once, it will be applied multiple times. Default: `["dry", "top_k", "typ_p", "top_p", "min_p", "xtc", "temperature"]` - these are all the available values.
|
||||
|
||||
`timings_per_token`: Include prompt processing and text generation speed information in each response. Default: `false`
|
||||
|
||||
**Response format**
|
||||
|
||||
- Note: When using streaming mode (`stream`), only `content` and `stop` will be returned until end of completion.
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue