server : add min_p param (#3877)

* Update server.cpp with min_p after it was introduced in https://github.com/ggerganov/llama.cpp/pull/3841

* Use spaces instead of tabs

* Update index.html.hpp after running deps.sh

* Fix test - fix line ending
Mihai 2023-11-09 04:00:34 +02:00 committed by GitHub
parent 875fb42871
commit 57ad015dc3
4 changed files with 2211 additions and 2191 deletions


@@ -122,6 +122,8 @@ node index.js
`top_p`: Limit the next token selection to a subset of tokens with a cumulative probability above a threshold P (default: 0.95).
`min_p`: The minimum probability for a token to be considered, relative to the probability of the most likely token (default: 0.05).
`n_predict`: Set the maximum number of tokens to predict when generating text. **Note:** May exceed the set limit slightly if the last token is a partial multibyte character. When 0, no tokens will be generated but the prompt is evaluated into the cache. (default: -1, -1 = infinity).
`n_keep`: Specify the number of tokens from the prompt to retain when the context size is exceeded and tokens need to be discarded.
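The parameters listed above map directly onto fields of the JSON body accepted by the server's `/completion` endpoint. The snippet below is a minimal sketch of such a request, not part of the commit itself: it assumes a server started with default settings listening on `127.0.0.1:8080`, Node.js 18+ for the built-in `fetch`, and that the generated text is returned in the response's `content` field.

```typescript
// Minimal sketch: send a completion request with the sampling parameters
// documented above. Assumes a llama.cpp server on its default local address.

async function complete(prompt: string): Promise<string> {
  const response = await fetch("http://127.0.0.1:8080/completion", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      prompt,          // text for the model to continue
      top_p: 0.95,     // nucleus sampling threshold (README default)
      min_p: 0.05,     // drop tokens below 5% of the top token's probability (README default)
      n_predict: 128,  // generate at most 128 tokens
      n_keep: 0,       // prompt tokens to retain when the context overflows
    }),
  });
  const data = (await response.json()) as { content: string };
  return data.content; // generated text returned by the server
}

complete("Building a website can be done in 10 simple steps:").then(console.log);
```

Passing `min_p` alongside `top_p` lets the sampler discard tokens whose probability falls below the given fraction of the most likely token's probability, which is the behaviour this commit exposes through the server API.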