tts : add OuteTTS support (#10784)

* server : add "tokens" output

ggml-ci

* server : output embeddings for all tokens when pooling = none

ggml-ci

* server : be explicit about the pooling type in the tests

ggml-ci

* server : do not normalize embeddings when there is no pooling

ggml-ci

* llama : add OuteTTS support (wip)

* wip

* extract features

* first conv

* group norm

* resnet conv

* resnet

* attn

* pos net

* layer norm

* convnext

* head

* hann window

* fix n_embd + remove llama.cpp hacks

* compute hann window

* fft

* spectrum processing

* clean-up

* tts : receive input text and generate codes

* clip : fix new conv name

* tts : minor fix

* tts : add header + minor fixes

ggml-ci

* tts : add mathematical constant

ggml-ci

* tts : fix sampling + cut initial noise

* tts : fixes

* tts : update default samplers

ggml-ci

* tts : text pre-processing

* tts : outetts-voc -> wavtokenizer-dec

* tts : remove hardcoded constants

ggml-ci

* tts : fix tensor shapes

* llama : refactor wavtokenizer tensors

ggml-ci

* cont

ggml-ci

* cont [no ci]

* llama : update WavTokenizer to non-causal attn

* llama : handle no-vocab detokenization

* tts : add Python example for OuteTTS (wip)

* tts : extend python example to generate spectrogram

ggml-ci

* server : fix rebase artifacts

* tts : enable "return_tokens" in Python example

ggml-ci

* tts : minor fixes

* common : support HF download for vocoder

Author: Georgi Gerganov
Date: 2024-12-18 19:27:21 +02:00
Committed by: GitHub

parent 7bbb5acf12

commit 0bf2d10c55

No known key found for this signature in database

GPG key ID: B5690EEEBB952194

19 changed files with 2509 additions and 532 deletions

src/llama-vocab.cpp | 4 ++++

@@ -1867,6 +1867,10 @@ int32_t llama_detokenize_impl(
                          int32_t   text_len_max,
                             bool   remove_special,
                             bool   unparse_special) {
+    if (vocab.type == LLAMA_VOCAB_TYPE_NONE) {
+        return 0;
+    }
+
     GGML_ASSERT(vocab.tokenizer && "Tokenizer not initialized. Call llama_vocab::init_tokenizer() first.");
 
     int32_t avail = text_len_max;
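
The hunk above places an early return ahead of the tokenizer assertion, so that a model with no vocabulary (`LLAMA_VOCAB_TYPE_NONE`) detokenizes to zero bytes instead of tripping the assert. This matches the "handle no-vocab detokenization" item in the commit message. The following is a minimal standalone sketch of that guard ordering, not the llama.cpp implementation: `toy_vocab`, `vocab_type`, and `toy_detokenize` are hypothetical names invented for illustration, and returning the negated required size when the buffer is too small is a simplification assumed here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not llama.cpp types.
enum class vocab_type { none, spm };

struct toy_vocab {
    vocab_type               type            = vocab_type::none;
    bool                     tokenizer_ready = false; // stands in for vocab.tokenizer
    std::vector<std::string> pieces;                  // token id -> text piece
};

// Writes the concatenated pieces into `text` (no null terminator).
// Returns the number of bytes written, or the negated number of bytes
// required when `text_len_max` is too small (assumed convention).
int32_t toy_detokenize(const toy_vocab & vocab,
                       const std::vector<int32_t> & tokens,
                       char * text,
                       int32_t text_len_max) {
    // Guard first: with no vocabulary there is nothing to detokenize,
    // and the tokenizer assertion below must not be reached.
    if (vocab.type == vocab_type::none) {
        return 0;
    }

    assert(vocab.tokenizer_ready && "tokenizer not initialized");

    int32_t total = 0;
    bool    fits  = true;
    for (const int32_t id : tokens) {
        const std::string & piece = vocab.pieces.at(id);
        const int32_t n = static_cast<int32_t>(piece.size());
        if (fits && total + n <= text_len_max) {
            piece.copy(text + total, piece.size()); // copy raw bytes of this piece
        } else {
            fits = false; // keep counting so the caller learns the required size
        }
        total += n;
    }
    return fits ? total : -total;
}
```

Calling `toy_detokenize` with a default-constructed `toy_vocab` returns 0 without consulting `tokenizer_ready`, which is the behaviour the added lines give callers that run models without a text vocabulary. Swapping the guard and the assertion would abort for that case in a debug build.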
