server : update readme
ggml-ci
parent 7e693f92d7
commit 3a7c001fe3
1 changed file with 41 additions and 1 deletion
@@ -763,6 +763,8 @@ curl http://localhost:8080/v1/chat/completions \

### POST `/v1/embeddings`: OpenAI-compatible embeddings API

This endpoint requires that the model uses a pooling type other than `none`.

*Options:*

See [OpenAI Embeddings API documentation](https://platform.openai.com/docs/api-reference/embeddings).
@@ -795,7 +797,45 @@ See [OpenAI Embeddings API documentation](https://platform.openai.com/docs/api-r
}'
```

When `--pooling none` is used, the server will output an array of embeddings - one for each token in the input.
### POST `/embeddings`: non-OpenAI-compatible embeddings API

This endpoint supports `--pooling none`. When it is used, the response contains the embeddings for all input tokens.
Note that the response format differs slightly from `/v1/embeddings`: it does not have the `"data"` sub-tree, and the embeddings are always returned as a vector of vectors.
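
A minimal sketch of how a client might cope with both response shapes. It assumes the `/v1/embeddings` body follows the OpenAI layout (a top-level object with a `"data"` list of items holding flat vectors) and that the `/embeddings` body is a top-level list whose items hold a vector of vectors, as described above. The helper name `extract_embeddings` and the sample payloads are hypothetical and exist only for illustration.

```python
from typing import Any


def extract_embeddings(body: Any) -> list[list[list[float]]]:
    """Return one vector-of-vectors per input, whichever endpoint produced `body`."""
    # OpenAI-style body (assumed): the per-input items live under the "data" key.
    # Non-OpenAI body: the per-input items are the top-level list itself.
    items = body["data"] if isinstance(body, dict) else body

    result = []
    for item in sorted(items, key=lambda it: it["index"]):
        emb = item["embedding"]
        # A pooled embedding is a flat vector; wrap it so that every entry
        # has the same vector-of-vectors shape.
        if emb and not isinstance(emb[0], list):
            emb = [emb]
        result.append(emb)
    return result


# Hypothetical sample payloads with tiny 2-dimensional vectors.
openai_style = {"object": "list", "data": [{"index": 0, "embedding": [0.1, 0.2]}]}
native_style = [{"index": 0, "embedding": [[0.1, 0.2], [0.3, 0.4]]}]

pooled = extract_embeddings(openai_style)     # one input, one pooled vector
per_token = extract_embeddings(native_style)  # one input, two token vectors
```

Normalizing to a vector of vectors keeps downstream code identical regardless of which endpoint was called; a caller that only ever expects pooled output could instead take the first row.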

*Options:*

Same as the `/v1/embeddings` endpoint.

*Examples:*

Same as the `/v1/embeddings` endpoint.

**Response format**

```json
[
  {
    "index": 0,
    "embedding": [
      [ ... embeddings for token 0 ... ],
      [ ... embeddings for token 1 ... ],
      [ ... ]
      [ ... embeddings for token N-1 ... ],
    ]
  },
  ...
  {
    "index": P,
    "embedding": [
      [ ... embeddings for token 0 ... ],
      [ ... embeddings for token 1 ... ],
      [ ... ]
      [ ... embeddings for token N-1 ... ],
    ]
  }
]
```
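
As a rough illustration of consuming this format, the sketch below walks a parsed response and, for each input, counts the token vectors and averages them on the client side. The function name `mean_pool` and the sample `response` are hypothetical, and the averaging is only an example of post-processing; it is not claimed to match what the server computes for any built-in pooling type.

```python
def mean_pool(token_vectors: list[list[float]]) -> list[float]:
    """Average N token vectors of equal length into a single vector."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[d] for vec in token_vectors) / n for d in range(dim)]


# Hypothetical parsed response for two inputs, using 2-dimensional vectors.
response = [
    {"index": 0, "embedding": [[1.0, 2.0], [3.0, 4.0]]},
    {"index": 1, "embedding": [[0.5, 0.5]]},
]

# Map each input index to (token count, client-side mean of its token vectors).
summary = {
    item["index"]: (len(item["embedding"]), mean_pool(item["embedding"]))
    for item in response
}
```

Here `summary[0][0]` is the number of tokens in the first input, which is also the length of its `"embedding"` list.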

### GET `/slots`: Returns the current slots processing state