server : output embeddings for all tokens when pooling = none (#10861)

* server : add "tokens" output

ggml-ci

* server : output embeddings for all tokens when pooling = none

ggml-ci

* server : update readme [no ci]

* server : fix spacing [no ci]

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

* server : be explicit about the pooling type in the tests

ggml-ci

* server : update /embeddings and /v1/embeddings endpoints

ggml-ci

* server : do not normalize embeddings when there is no pooling

ggml-ci

* server : update readme

ggml-ci

* server : fixes

* tests : update server tests

ggml-ci

* server : update readme [no ci]

* server : remove rebase artifact

---------

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Georgi Gerganov 2024-12-18 13:01:41 +02:00 committed by GitHub
parent 0e70ba686e
commit 152610eda9
GPG key ID: B5690EEEBB952194
8 changed files with 158 additions and 37 deletions
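
The test-harness change in the diff below passes a new `pooling` option through to the server's `--pooling` flag. A minimal standalone sketch of that pattern follows. `ServerArgs` and `build()` are hypothetical names used only for illustration; the field names and flags are the ones visible in the diff, and the real `ServerProcess` class has many more options.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ServerArgs:
    """Hypothetical stand-in for a few of the ServerProcess options."""

    server_metrics: bool | None = False
    server_slots: bool | None = False
    pooling: str | None = None
    model_alias: str | None = None

    def build(self) -> list[str]:
        # Same pattern as the diff: a flag is emitted only when its option is set.
        server_args: list[str] = []
        if self.server_metrics:
            server_args.append("--metrics")
        if self.server_slots:
            server_args.append("--slots")
        if self.pooling:
            # The pooling value (for example "none") is passed through verbatim.
            server_args.extend(["--pooling", self.pooling])
        if self.model_alias:
            server_args.extend(["--alias", self.model_alias])
        return server_args


print(ServerArgs(pooling="none").build())  # ['--pooling', 'none']
```

Leaving `pooling` as `None` adds no flag, so the server falls back to its own default pooling type. This appears to be why the commit message speaks of being explicit about the pooling type in the tests.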


@@ -65,6 +65,7 @@ class ServerProcess:
     server_reranking: bool | None = False
     server_metrics: bool | None = False
     server_slots: bool | None = False
+    pooling: str | None = None
     draft: int | None = None
     api_key: str | None = None
     response_format: str | None = None
@@ -132,6 +133,8 @@ class ServerProcess:
             server_args.append("--metrics")
         if self.server_slots:
             server_args.append("--slots")
+        if self.pooling:
+            server_args.extend(["--pooling", self.pooling])
         if self.model_alias:
             server_args.extend(["--alias", self.model_alias])
         if self.n_ctx: