Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)
--------- Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
This commit is contained in:
parent
27d135c970
commit
8b576b6c55
48 changed files with 3861 additions and 156 deletions
|
@ -1117,6 +1117,82 @@ curl http://localhost:8080/v1/chat/completions \
|
|||
}'
|
||||
```
|
||||
|
||||
... and even tool usage (needs `--jinja` flag):
|
||||
|
||||
```shell
|
||||
llama-server --jinja -hfr lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF -hff Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf -fa
|
||||
|
||||
# https://huggingface.co/meetkai/functionary-medium-v3.2
|
||||
llama-server --jinja -hfr bartowski/functionary-medium-v3.2-GGUF -hff functionary-medium-v3.2-IQ4_XS.gguf -fa
|
||||
|
||||
# https://huggingface.co/meetkai/functionary-medium-v3.1
|
||||
llama-server --jinja -hfr meetkai/functionary-medium-v3.1-GGUF -hff functionary-medium-llama-3.1.Q4_0.gguf -fa
|
||||
|
||||
curl http://localhost:8080/v1/chat/completions -d '{
|
||||
"model": "gpt-3.5-turbo",
|
||||
"tools": [
|
||||
{
|
||||
"type":"function",
|
||||
"function":{
|
||||
"name":"get_current_weather",
|
||||
"description":"Get the current weather in a given location",
|
||||
"parameters":{
|
||||
"type":"object",
|
||||
"properties":{
|
||||
"location":{
|
||||
"type":"string",
|
||||
"description":"The city and state, e.g. San Francisco, CA"
|
||||
}
|
||||
},
|
||||
"required":["location"]
|
||||
}
|
||||
}
|
||||
}
|
||||
],
|
||||
"messages": [
|
||||
{
|
||||
"role": "user",
|
||||
"content": "What is the weather like in Istanbul?."
|
||||
}
|
||||
]
|
||||
}'
|
||||
```
|
||||
|
||||
<details>
|
||||
<summary>Show output</summary>
|
||||
|
||||
```json
|
||||
{
|
||||
"choices": [
|
||||
{
|
||||
"finish_reason": "tool",
|
||||
"index": 0,
|
||||
"message": {
|
||||
"content": null,
|
||||
"tool_calls": [
|
||||
{
|
||||
"name": "python",
|
||||
"arguments": "{\"code\":\" \\nprint(\\\"Hello, World!\\\")\"}"
|
||||
}
|
||||
],
|
||||
"role": "assistant"
|
||||
}
|
||||
}
|
||||
],
|
||||
"created": 1727287211,
|
||||
"model": "gpt-3.5-turbo",
|
||||
"object": "chat.completion",
|
||||
"usage": {
|
||||
"completion_tokens": 16,
|
||||
"prompt_tokens": 44,
|
||||
"total_tokens": 60
|
||||
},
|
||||
"id": "chatcmpl-Htbgh9feMmGM0LEH2hmQvwsCxq3c6Ni8"
|
||||
}
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
### POST `/v1/embeddings`: OpenAI-compatible embeddings API
|
||||
|
||||
This endpoint requires that the model uses a pooling different than type `none`. The embeddings are normalized using the Eucledian norm.
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue