server : add n_indent parameter for line indentation requirement (#9929)

ggml-ci
Georgi Gerganov 2024-10-18 07:32:19 +03:00 committed by GitHub
parent 6f55bccbb8
commit 8901755ba3
GPG key ID: B5690EEEBB952194
2 changed files with 49 additions and 7 deletions

@@ -333,6 +333,8 @@ node index.js
`n_predict`: Set the maximum number of tokens to predict when generating text. **Note:** May exceed the set limit slightly if the last token is a partial multibyte character. When 0, no tokens will be generated but the prompt is evaluated into the cache. Default: `-1`, where `-1` is infinity.
`n_indent`: Specify the minimum line indentation for the generated text, in number of whitespace characters. Useful for code completion tasks. Default: `0`.
`n_keep`: Specify the number of tokens from the prompt to retain when the context size is exceeded and tokens need to be discarded. The number excludes the BOS token.
By default, this value is set to `0`, meaning no tokens are kept. Use `-1` to retain all tokens from the prompt.
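
As a rough illustration of how the three parameters above might be combined in one request body, here is a minimal Python sketch. The helper name `build_completion_payload` and the sample prompt are hypothetical and not part of this commit. The sketch only assembles and serialises the JSON; it assumes the body would then be sent to the server's completion endpoint.

```python
import json


def build_completion_payload(prompt, n_predict=-1, n_indent=0, n_keep=0):
    """Hypothetical helper: assemble a request body using the documented defaults.

    n_predict: maximum number of tokens to generate (-1 means no limit)
    n_indent:  minimum line indentation, in whitespace characters
    n_keep:    prompt tokens to retain when the context overflows (-1 keeps all)
    """
    return {
        "prompt": prompt,
        "n_predict": n_predict,
        "n_indent": n_indent,
        "n_keep": n_keep,
    }


# Example: complete the body of an indented function, asking for
# at least 4 whitespace characters of indentation per generated line.
payload = build_completion_payload("def add(a, b):\n    ", n_predict=64, n_indent=4)
body = json.dumps(payload)
print(body)
```

Leaving `n_indent` at its default of `0` imposes no indentation requirement, so a client that does not do code completion can simply omit it.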