server : add min_p param (#3877)
* Update server.cpp with min_p after it was introduced in https://github.com/ggerganov/llama.cpp/pull/3841
* Use spaces instead of tabs
* Update index.html.hpp after running deps.sh
* Fix test - fix line ending
parent 875fb42871
commit 57ad015dc3
4 changed files with 2211 additions and 2191 deletions
@@ -122,6 +122,8 @@ node index.js
`top_p`: Limit the next token selection to a subset of tokens with a cumulative probability above a threshold P (default: 0.95).
`min_p`: The minimum probability for a token to be considered, relative to the probability of the most likely token (default: 0.05).
`n_predict`: Set the maximum number of tokens to predict when generating text. **Note:** May exceed the set limit slightly if the last token is a partial multibyte character. When 0, no tokens will be generated but the prompt is evaluated into the cache. (default: -1, -1 = infinity).
`n_keep`: Specify the number of tokens from the prompt to retain when the context size is exceeded and tokens need to be discarded.
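For reference, a minimal sketch of the `min_p` rule documented above, written in TypeScript rather than the server's C++: a candidate token is kept only if its probability is at least `min_p` times the probability of the most likely token. The `Candidate` type and `minPFilter` name are illustrative, not part of llama.cpp.

```ts
// Illustrative only: a candidate token with its post-softmax probability.
type Candidate = { id: number; p: number };

// min_p filtering: keep tokens whose probability is at least min_p times the
// probability of the most likely token. With min_p = 0.05, a token survives
// only if it is at least 5% as likely as the top candidate.
function minPFilter(candidates: Candidate[], minP: number): Candidate[] {
  if (candidates.length === 0 || minP <= 0) return candidates;
  const pMax = Math.max(...candidates.map(c => c.p));
  return candidates.filter(c => c.p >= minP * pMax);
}
```

And a sketch of passing the new parameter to the server's `/completion` endpoint, assuming the server is running locally on its default port 8080; the field names follow the README excerpt above, and the response is assumed to carry the generated text in `content`.

```ts
// Assumes a llama.cpp server listening on localhost:8080.
async function complete(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:8080/completion", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      prompt,
      n_predict: 128, // cap generation at 128 tokens
      top_p: 0.95,    // nucleus sampling threshold
      min_p: 0.05,    // new: drop tokens below 5% of the top token's probability
    }),
  });
  const data = await res.json();
  return data.content;
}
```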