server : normalize embeddings (#5956)

* output normalized embeddings in '/v1/embeddings'

* common : reuse llama_embd_normalize

* common : better normalize impl

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
SeungWon Jeong 2024-03-09 21:27:58 +09:00 committed by GitHub
parent 2c4f566c88
commit fb215c3832
4 changed files with 30 additions and 14 deletions

@@ -260,3 +260,10 @@ void dump_kv_cache_view(const llama_kv_cache_view & view, int row_size = 80);
// Dump the KV cache view showing individual sequences in each cell (long output).
void dump_kv_cache_view_seqs(const llama_kv_cache_view & view, int row_size = 40);
//
// Embedding utils
//
void llama_embd_normalize(const float * inp, float * out, int n);
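
The hunk above only shows the declaration. As a rough illustration of what a helper with this signature might do, here is a minimal sketch that assumes plain L2 (Euclidean) normalization. The name `embd_normalize_sketch`, the double-precision accumulator, and the zero-vector guard are assumptions made for this example and are not taken from the commit.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for llama_embd_normalize (not the code from the commit).
// Assumption: the embedding is scaled to unit L2 norm.
static void embd_normalize_sketch(const float * inp, float * out, int n) {
    assert(inp != nullptr && out != nullptr && n >= 0);

    // accumulate the sum of squares in double to limit rounding error
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        sum += (double) inp[i] * inp[i];
    }

    // assumption: an all-zero input maps to an all-zero output instead of dividing by zero
    const float scale = sum > 0.0 ? (float) (1.0 / std::sqrt(sum)) : 0.0f;

    for (int i = 0; i < n; i++) {
        out[i] = inp[i] * scale;
    }
}
```

With this sketch, an input of `{3, 4}` would come out as roughly `{0.6, 0.8}`. Each element is read before it is written, so passing the same buffer as `inp` and `out` also works.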