4386 commits

Author SHA1 Message Date
Georgi Gerganov
befdcd2492
tts : text pre-processing 2024-12-18 14:02:25 +02:00
Georgi Gerganov
3d54be4d84
tts : update default samplers
ggml-ci
2024-12-18 14:02:25 +02:00
Georgi Gerganov
1d7c27ca93
tts : fixes 2024-12-18 14:02:25 +02:00
Georgi Gerganov
906a0edb5a
tts : fix sampling + cut initial noise 2024-12-18 14:02:24 +02:00
Georgi Gerganov
2221e54278
tts : add mathematical constant
ggml-ci
2024-12-18 14:02:24 +02:00
Georgi Gerganov
d4fa34bdd4
tts : add header + minor fixes
ggml-ci
2024-12-18 14:02:24 +02:00
Georgi Gerganov
8329e850cc
tts : minor fix 2024-12-18 14:02:24 +02:00
Georgi Gerganov
db613915de
clip : fix new conv name 2024-12-18 14:02:24 +02:00
Georgi Gerganov
b9a011e123
tts : receive input text and generate codes 2024-12-18 14:02:24 +02:00
Georgi Gerganov
191da330fc
clean-up 2024-12-18 14:02:23 +02:00
Georgi Gerganov
e52797162e
spectrum processing 2024-12-18 14:02:23 +02:00
Georgi Gerganov
5a1c98e8d2
fft 2024-12-18 14:02:23 +02:00
Georgi Gerganov
e728cfd297
compute hann window 2024-12-18 14:02:23 +02:00
Georgi Gerganov
a1f08ad338
fix n_embd + remove llama.cpp hacks 2024-12-18 14:02:23 +02:00
Georgi Gerganov
eb1b70f42a
hann window 2024-12-18 14:02:23 +02:00
Georgi Gerganov
839035d1bb
head 2024-12-18 14:02:22 +02:00
Georgi Gerganov
fe6dd5aa61
convnext 2024-12-18 14:02:22 +02:00
Georgi Gerganov
b3ba05e5bc
layer norm 2024-12-18 14:02:22 +02:00
Georgi Gerganov
435cfd788b
pos net 2024-12-18 14:02:22 +02:00
Georgi Gerganov
3046fde420
attn 2024-12-18 14:02:22 +02:00
Georgi Gerganov
13dd8941a4
resnet 2024-12-18 14:02:22 +02:00
Georgi Gerganov
3d08d62b6c
resnet conv 2024-12-18 14:02:21 +02:00
Georgi Gerganov
5296c96ca8
group norm 2024-12-18 14:02:21 +02:00
Georgi Gerganov
6ef14091c0
first conv 2024-12-18 14:02:21 +02:00
Georgi Gerganov
aac7e04953
extract features 2024-12-18 14:02:21 +02:00
Georgi Gerganov
ff2ea75fb4
wip 2024-12-18 14:02:21 +02:00
Georgi Gerganov
f169965158
llama : add OuteTTS support (wip) 2024-12-18 14:02:20 +02:00
Georgi Gerganov
e65556f174
server : do not normalize embeddings when there is no pooling
ggml-ci
2024-12-18 14:02:05 +02:00
Georgi Gerganov
1b18b2d7b0
server : be explicit about the pooling type in the tests
ggml-ci
2024-12-18 14:01:22 +02:00
Georgi Gerganov
06e85401b0
server : output embeddings for all tokens when pooling = none
ggml-ci
2024-12-18 14:00:50 +02:00
Georgi Gerganov
89eaf5036a
server : add "tokens" output
ggml-ci
2024-12-18 13:59:47 +02:00
Georgi Gerganov
152610eda9
server : output embeddings for all tokens when pooling = none (#10861)
* server : add "tokens" output

ggml-ci

* server : output embeddings for all tokens when pooling = none

ggml-ci

* server : update readme [no ci]

* server : fix spacing [no ci]

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

* server : be explicit about the pooling type in the tests

ggml-ci

* server : update /embeddings and /v1/embeddings endpoints

ggml-ci

* server : do not normalize embeddings when there is no pooling

ggml-ci

* server : update readme

ggml-ci

* server : fixes

* tests : update server tests

ggml-ci

* server : update readme [no ci]

* server : remove rebase artifact

---------

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
2024-12-18 13:01:41 +02:00
Georgi Gerganov
0e70ba686e
server : add "tokens" output (#10853)
* server : add "tokens" output

ggml-ci

* server : update readme

ggml-ci

* server : return tokens ids only if requested

ggml-ci

* tests : improve "tokens" type check

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>

* server : remove "tokens" from the OAI endpoint

ggml-ci

---------

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
2024-12-18 11:05:29 +02:00
Xuan Son Nguyen
46828872c3
server : (embeddings) using same format for "input" and "content" (#10872)
* server : (embeddings) using same format for "input" and "content"

* fix test case

* handle empty input case

* fix test
2024-12-18 10:55:09 +02:00
redbeard
6b064c92b4
docs: Fix HIP (née hipBLAS) in README (#10880)
Related to #10524 / be0e350c: references to hipBLAS have been removed
across the repository. This fixes the link from the repository's
`README.md`.

Signed-off-by: Brian 'redbeard' Harrington <redbeard@dead-city.org>
2024-12-18 10:35:00 +02:00
Diego Devesa
4da69d1abd
Revert "llama : add Falcon3 support (#10864)" (#10876)
This reverts commit 382bc7f2e8.
2024-12-18 01:36:46 +01:00
DAN™
d62b532c52
Use model->gguf_kv for loading the template instead of using the C API. (#10868)
* Bump model_template to 16384 bytes to support larger chat templates.

* Use `model->gguf_kv` for efficiency.
2024-12-17 23:24:22 +01:00
Johannes Gäßler
081b29bd2a
tests: add tests for GGUF (#10830) 2024-12-17 19:09:35 +01:00
Georgi Gerganov
5437d4aaf5
sync : ggml 2024-12-17 18:36:02 +02:00
Georgi Gerganov
78f766768d
cmake : fix "amd64" processor string (whisper/2638) 2024-12-17 18:35:49 +02:00
gn64
8dd19a4812
vulkan : fix soft_max.comp division by zero (whisper/2633)
This change prevents a division by zero error when p.KY is 0.
2024-12-17 18:35:49 +02:00
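The kind of guard this commit describes can be sketched in plain C. This is only an illustration: the helper name `mask_row_index` is hypothetical, and the assumption that the zero divisor appears in a modulo on a row index is not taken from the shader source. Only the fact that `p.KY` can be 0 comes from the commit message.

```c
#include <stdint.h>

/* Hypothetical helper (not from the shader): wrap a row index by ky,
 * falling back to row 0 when ky is 0 so the modulo never divides by zero. */
static uint32_t mask_row_index(uint32_t rowx, uint32_t ky) {
    return ky > 0 ? rowx % ky : 0;
}
```

Falling back to 0 is one possible choice when there is nothing to index. The actual shader may handle the case differently.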
Daniel Bevenius
130d0c90bd
ggml : remove return from ggml_gallocr_allocate_node (ggml/1048)
This commit removes the return statement from ggml_gallocr_allocate_node
function.

The motivation behind this change is to make the code more readable and
consistent.
2024-12-17 18:35:49 +02:00
Daniel Bevenius
3919da8e33
ggml : add check for grad_accs (ggml/1046)
* ggml : add check for grad_accs

This commit adds a check for grad_accs in ggml_graph_get_grad and
ggml_graph_get_grad_acc functions. This is necessary to avoid segfaults
when grad_accs is not initialized.

The motivation for this change is that I find it nice to be able to
print out a computation graph using ggml_graph_print but this function
segfaults when grad_accs is not initialized:
```console
(gdb) p g1
$2 = (ggml_cgraph *) 0x7ffff66004b0
(gdb) p *g1
$3 = {size = 2048, n_nodes = 1, n_leafs = 2, nodes = 0x7ffff6600500,
grads = 0x0, grad_accs = 0x0, leafs = 0x7ffff6604500,
visited_hash_set = {size = 4099, used = 0x7ffff6610518,
keys = 0x7ffff6608500}, order = GGML_CGRAPH_EVAL_ORDER_LEFT_TO_RIGHT}
(gdb) p ggml_graph_print(g1)
=== GRAPH ===
n_nodes = 1

Program received signal SIGSEGV, Segmentation fault.
0x0000555555579775 in ggml_graph_get_grad
(cgraph=0x7ffff66004b0,node=0x7ffff6600340)
    at /ggml/ggml/src/ggml.c:5990
5990  return igrad != GGML_HASHSET_FULL &&
          ggml_bitset_get(cgraph->visited_hash_set.used, igrad) ?
          cgraph->grads[igrad] : NULL;
```

* squash! ggml : add check for grad_accs

Fix the check in ggml_graph_get_grad. The check was incorrectly using
cgraph->grad_accs instead of cgraph->grads.
2024-12-17 18:35:48 +02:00
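The null checks described in the commit above can be sketched with simplified stand-in types. Everything named `toy_*` is hypothetical and is not the real ggml API. The real functions look up a hash-set slot first, which is omitted here. The sketch only shows the point of the follow-up fix: each getter guards the array it is about to index (`grads` in one, `grad_accs` in the other).

```c
#include <stddef.h>

/* Simplified stand-ins for illustration only; these are NOT the real ggml types. */
struct toy_tensor { int id; };

struct toy_graph {
    size_t               size;       /* number of slots */
    struct toy_tensor ** grads;      /* may be NULL when gradients were never allocated */
    struct toy_tensor ** grad_accs;  /* may be NULL as well */
};

/* Return the gradient stored in slot i, or NULL when the slot is out of range
 * or the grads array itself was never allocated. */
static struct toy_tensor * toy_get_grad(const struct toy_graph * g, size_t i) {
    if (g == NULL || g->grads == NULL || i >= g->size) {
        return NULL;
    }
    return g->grads[i];
}

/* Same guard for the accumulator array: each getter checks its own array. */
static struct toy_tensor * toy_get_grad_acc(const struct toy_graph * g, size_t i) {
    if (g == NULL || g->grad_accs == NULL || i >= g->size) {
        return NULL;
    }
    return g->grad_accs[i];
}
```

With both arrays left as NULL, as in the gdb session above, both getters return NULL instead of dereferencing a null pointer.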
Georgi Gerganov
0006f5a74a
ggml : update ggml_backend_cpu_device_supports_op (#10867)
* ggml : fix cpy op for IQ-quants to use reference impl

ggml-ci

* ggml : disable tests involving i-matrix quantization

* ggml : update ggml_backend_cpu_device_supports_op

ggml-ci
2024-12-17 18:35:42 +02:00
krystiancha
05c3a444b8
server : fill usage info in embeddings and rerank responses (#10852)
* server : fill usage info in embeddings response

* server : fill usage info in reranking response
2024-12-17 18:00:24 +02:00
Billel Mokeddem
382bc7f2e8
llama : add Falcon3 support (#10864) 2024-12-17 17:24:56 +02:00
Ruan
4f51968aca
readme : update typos (#10863) 2024-12-17 11:47:20 +02:00
Xuan Son Nguyen
227d7c5a7f
server : (UI) fix missing async generator on safari (#10857)
* server : (UI) fix missing async generator on safari

* fix
2024-12-17 09:52:09 +01:00
Eve
7b1ec53f56
vulkan: bugfixes for small subgroup size systems + llvmpipe test (#10809)
* ensure mul mat shaders work on systems with subgroup size less than 32

more fixes

add test

* only s_warptile_mmq needs to be run with 32 threads or more
2024-12-17 06:52:55 +01:00
Zhiyuan Li
160bc039c8
rwkv6: add wkv6 support for Vulkan backend (#10829)
* rwkv_wkv6 vulkan shader

* RWKV_WKV6 Vulkan op tests passed

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

* Apply code format changes

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

* add [[unroll]] and remove unnecessary conditions

* add uma support

* fix errors in EditorConfig Checker

---------

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
Co-authored-by: Molly Sophia <mollysophia379@gmail.com>
2024-12-16 22:00:46 +01:00