Commit graph

4395 commits

Author SHA1 Message Date
Georgi Gerganov
5038abe1ee
tts : add Python example for OuteTTS (wip) 2024-12-18 14:02:27 +02:00
Georgi Gerganov
d291c74253
llama : handle no-vocab detokenization 2024-12-18 14:02:26 +02:00
Georgi Gerganov
824fa750d4
llama : update WavTokenizer to non-causal attn 2024-12-18 14:02:26 +02:00
Georgi Gerganov
2033fb7eef
cont [no ci] 2024-12-18 14:02:26 +02:00
Georgi Gerganov
35259e5335
cont
ggml-ci
2024-12-18 14:02:26 +02:00
Georgi Gerganov
980d631032
llama : refactor wavtokenizer tensors
ggml-ci
2024-12-18 14:02:26 +02:00
Georgi Gerganov
d1ef627c51
tts : fix tensor shapes 2024-12-18 14:02:26 +02:00
Georgi Gerganov
c096bbd8dd
tts : remove hardcoded constants
ggml-ci
2024-12-18 14:02:25 +02:00
Georgi Gerganov
e70f140c04
tts : outetts-voc -> wavtokenizer-dec 2024-12-18 14:02:25 +02:00
Georgi Gerganov
befdcd2492
tts : text pre-processing 2024-12-18 14:02:25 +02:00
Georgi Gerganov
3d54be4d84
tts : update default samplers
ggml-ci
2024-12-18 14:02:25 +02:00
Georgi Gerganov
1d7c27ca93
tts : fixes 2024-12-18 14:02:25 +02:00
Georgi Gerganov
906a0edb5a
tts : fix sampling + cut initial noise 2024-12-18 14:02:24 +02:00
Georgi Gerganov
2221e54278
tts : add mathematical constant
ggml-ci
2024-12-18 14:02:24 +02:00
Georgi Gerganov
d4fa34bdd4
tts : add header + minor fixes
ggml-ci
2024-12-18 14:02:24 +02:00
Georgi Gerganov
8329e850cc
tts : minor fix 2024-12-18 14:02:24 +02:00
Georgi Gerganov
db613915de
clip : fix new conv name 2024-12-18 14:02:24 +02:00
Georgi Gerganov
b9a011e123
tts : receive input text and generate codes 2024-12-18 14:02:24 +02:00
Georgi Gerganov
191da330fc
clean-up 2024-12-18 14:02:23 +02:00
Georgi Gerganov
e52797162e
spectrum processing 2024-12-18 14:02:23 +02:00
Georgi Gerganov
5a1c98e8d2
fft 2024-12-18 14:02:23 +02:00
Georgi Gerganov
e728cfd297
compute hann window 2024-12-18 14:02:23 +02:00
Georgi Gerganov
a1f08ad338
fix n_embd + remove llama.cpp hacks 2024-12-18 14:02:23 +02:00
Georgi Gerganov
eb1b70f42a
hann window 2024-12-18 14:02:23 +02:00
Georgi Gerganov
839035d1bb
head 2024-12-18 14:02:22 +02:00
Georgi Gerganov
fe6dd5aa61
convnext 2024-12-18 14:02:22 +02:00
Georgi Gerganov
b3ba05e5bc
layer norm 2024-12-18 14:02:22 +02:00
Georgi Gerganov
435cfd788b
pos net 2024-12-18 14:02:22 +02:00
Georgi Gerganov
3046fde420
attn 2024-12-18 14:02:22 +02:00
Georgi Gerganov
13dd8941a4
resnet 2024-12-18 14:02:22 +02:00
Georgi Gerganov
3d08d62b6c
resnet conv 2024-12-18 14:02:21 +02:00
Georgi Gerganov
5296c96ca8
group norm 2024-12-18 14:02:21 +02:00
Georgi Gerganov
6ef14091c0
first conv 2024-12-18 14:02:21 +02:00
Georgi Gerganov
aac7e04953
extract features 2024-12-18 14:02:21 +02:00
Georgi Gerganov
ff2ea75fb4
wip 2024-12-18 14:02:21 +02:00
Georgi Gerganov
f169965158
llama : add OuteTTS support (wip) 2024-12-18 14:02:20 +02:00
Georgi Gerganov
e65556f174
server : do not normalize embeddings when there is no pooling
ggml-ci
2024-12-18 14:02:05 +02:00
Georgi Gerganov
1b18b2d7b0
server : be explicit about the pooling type in the tests
ggml-ci
2024-12-18 14:01:22 +02:00
Georgi Gerganov
06e85401b0
server : output embeddings for all tokens when pooling = none
ggml-ci
2024-12-18 14:00:50 +02:00
Georgi Gerganov
89eaf5036a
server : add "tokens" output
ggml-ci
2024-12-18 13:59:47 +02:00
Georgi Gerganov
152610eda9
server : output embeddings for all tokens when pooling = none (#10861)
* server : add "tokens" output

ggml-ci

* server : output embeddings for all tokens when pooling = none

ggml-ci

* server : update readme [no ci]

* server : fix spacing [no ci]

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

* server : be explicit about the pooling type in the tests

ggml-ci

* server : update /embeddings and /v1/embeddings endpoints

ggml-ci

* server : do not normalize embeddings when there is no pooling

ggml-ci

* server : update readme

ggml-ci

* server : fixes

* tests : update server tests

ggml-ci

* server : update readme [no ci]

* server : remove rebase artifact

---------

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
2024-12-18 13:01:41 +02:00
Georgi Gerganov
0e70ba686e
server : add "tokens" output (#10853)
* server : add "tokens" output

ggml-ci

* server : update readme

ggml-ci

* server : return tokens ids only if requested

ggml-ci

* tests : improve "tokens" type check

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

* server : remove "tokens" from the OAI endpoint

ggml-ci

---------

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
2024-12-18 11:05:29 +02:00
Xuan Son Nguyen
46828872c3
server : (embeddings) using same format for "input" and "content" (#10872)
* server : (embeddings) using same format for "input" and "content"

* fix test case

* handle empty input case

* fix test
2024-12-18 10:55:09 +02:00
redbeard
6b064c92b4
docs: Fix HIP (née hipBLAS) in README (#10880)
Related to #10524 / be0e350c: references to hipBLAS have been removed
across the repository. This fixes the link from the repository's
`README.md`.

Signed-off-by: Brian 'redbeard' Harrington <redbeard@dead-city.org>
2024-12-18 10:35:00 +02:00
Diego Devesa
4da69d1abd
Revert "llama : add Falcon3 support (#10864)" (#10876)
This reverts commit 382bc7f2e8.
2024-12-18 01:36:46 +01:00
DAN™
d62b532c52
Use model->gguf_kv for loading the template instead of using the C API. (#10868)
* Bump model_template to 16384 bytes to support larger chat templates.

* Use `model->gguf_kv` for efficiency.
2024-12-17 23:24:22 +01:00
Johannes Gäßler
081b29bd2a
tests: add tests for GGUF (#10830) 2024-12-17 19:09:35 +01:00
Georgi Gerganov
5437d4aaf5
sync : ggml 2024-12-17 18:36:02 +02:00
Georgi Gerganov
78f766768d
cmake : fix "amd64" processor string (whisper/2638) 2024-12-17 18:35:49 +02:00
gn64
8dd19a4812
vulkan : fix soft_max.comp division by zero (whisper/2633)
This change prevents a division by zero error when p.KY is 0.
2024-12-17 18:35:49 +02:00
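
Note on 8dd19a4812: the commit message says only that a division by zero is prevented when p.KY is 0. As a rough illustration of that kind of guard (not the shader code itself), the Python sketch below assumes KY acts as a divisor when a row index is wrapped onto a smaller range. The function name `wrap_row` and the fallback value 0 are hypothetical and do not come from the commit.

```python
def wrap_row(row_x: int, ky: int) -> int:
    """Wrap row_x into the range [0, ky), tolerating ky == 0.

    Hypothetical helper: the name and the fallback value of 0 are
    assumptions made for illustration, not taken from soft_max.comp.
    """
    # An unguarded `row_x % ky` raises ZeroDivisionError in Python when
    # ky == 0; on a GPU the result of such an operation is generally
    # not something to rely on.
    if ky > 0:
        return row_x % ky
    # Nothing to wrap onto, so fall back to a fixed index.
    return 0


if __name__ == "__main__":
    print(wrap_row(5, 3))
    print(wrap_row(5, 0))
```

The guard takes the zero case out of the arithmetic entirely instead of relying on whatever the hardware does with a zero divisor.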