server : add /apply-template endpoint for additional use cases of Minja functionality (#11489)

* add /apply-template endpoint to server

* remove unnecessary line

* add /apply-template documentation

* return only "prompt" field in /apply-template

* use suggested idea instead of my overly verbose way
This commit is contained in:
Nigel Bosch 2025-01-29 12:45:44 -06:00 committed by GitHub
parent 66ee4f297c
commit eb7cf15a80
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
3 changed files with 32 additions and 0 deletions

View file

@ -576,6 +576,14 @@ With input 'á' (utf8 hex: C3 A1) on tinyllama/stories260k
`tokens`: Set the tokens to detokenize.
### POST `/apply-template`: Apply chat template to a conversation
Uses the server's prompt template formatting functionality to convert chat messages to a single string expected by a chat model as input, but does not perform inference. Instead, the prompt string is returned in the `prompt` field of the JSON response. The prompt can then be modified as desired (for example, to insert "Sure!" at the beginning of the model's response) before sending to `/completion` to generate the chat response.
*Options:*
`messages`: (Required) Chat turns in the same format as `/v1/chat/completions`.
### POST `/embedding`: Generate embedding of a given text
> [!IMPORTANT]