llama.cpp

Author	SHA1	Message	Date
Georgi Gerganov	befdcd2492	tts : text pre-processing	2024-12-18 14:02:25 +02:00
Georgi Gerganov	3d54be4d84	tts : update default samplers ggml-ci	2024-12-18 14:02:25 +02:00
Georgi Gerganov	1d7c27ca93	tts : fixes	2024-12-18 14:02:25 +02:00
Georgi Gerganov	906a0edb5a	tts : fix sampling + cut initial noise	2024-12-18 14:02:24 +02:00
Georgi Gerganov	2221e54278	tts : add matchematical constant ggml-ci	2024-12-18 14:02:24 +02:00
Georgi Gerganov	d4fa34bdd4	tts : add header + minor fixes ggml-ci	2024-12-18 14:02:24 +02:00
Georgi Gerganov	8329e850cc	tts : minor fix	2024-12-18 14:02:24 +02:00
Georgi Gerganov	db613915de	clip : fix new conv name	2024-12-18 14:02:24 +02:00
Georgi Gerganov	b9a011e123	tts : receive input text and generate codes	2024-12-18 14:02:24 +02:00
Georgi Gerganov	191da330fc	clean-up	2024-12-18 14:02:23 +02:00
Georgi Gerganov	e52797162e	spectrum processing	2024-12-18 14:02:23 +02:00
Georgi Gerganov	5a1c98e8d2	fft	2024-12-18 14:02:23 +02:00
Georgi Gerganov	e728cfd297	compute hann window	2024-12-18 14:02:23 +02:00
Georgi Gerganov	a1f08ad338	fix n_embd + remove llama.cpp hacks	2024-12-18 14:02:23 +02:00
Georgi Gerganov	eb1b70f42a	hann window	2024-12-18 14:02:23 +02:00
Georgi Gerganov	839035d1bb	head	2024-12-18 14:02:22 +02:00
Georgi Gerganov	fe6dd5aa61	convnext	2024-12-18 14:02:22 +02:00
Georgi Gerganov	b3ba05e5bc	layer norm	2024-12-18 14:02:22 +02:00
Georgi Gerganov	435cfd788b	pos net	2024-12-18 14:02:22 +02:00
Georgi Gerganov	3046fde420	attn	2024-12-18 14:02:22 +02:00
Georgi Gerganov	13dd8941a4	resnet	2024-12-18 14:02:22 +02:00
Georgi Gerganov	3d08d62b6c	resnet conv	2024-12-18 14:02:21 +02:00
Georgi Gerganov	5296c96ca8	group norm	2024-12-18 14:02:21 +02:00
Georgi Gerganov	6ef14091c0	first conv	2024-12-18 14:02:21 +02:00
Georgi Gerganov	aac7e04953	extract features	2024-12-18 14:02:21 +02:00
Georgi Gerganov	ff2ea75fb4	wip	2024-12-18 14:02:21 +02:00
Georgi Gerganov	f169965158	llama : add OuteTTS support (wip)	2024-12-18 14:02:20 +02:00
Georgi Gerganov	e65556f174	server : do not normalize embeddings when there is no pooling ggml-ci	2024-12-18 14:02:05 +02:00
Georgi Gerganov	1b18b2d7b0	server : be explicit about the pooling type in the tests ggml-ci	2024-12-18 14:01:22 +02:00
Georgi Gerganov	06e85401b0	server : output embeddings for all tokens when pooling = none ggml-ci	2024-12-18 14:00:50 +02:00
Georgi Gerganov	89eaf5036a	server : add "tokens" output ggml-ci	2024-12-18 13:59:47 +02:00
Georgi Gerganov	152610eda9	server : output embeddings for all tokens when pooling = none (#10861) * server : add "tokens" output ggml-ci * server : output embeddings for all tokens when pooling = none ggml-ci * server : update readme [no ci] * server : fix spacing [no ci] Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com> * server : be explicit about the pooling type in the tests ggml-ci * server : update /embeddings and /v1/embeddings endpoints ggml-ci * server : do not normalize embeddings when there is no pooling ggml-ci * server : update readme ggml-ci * server : fixes * tests : update server tests ggml-ci * server : update readme [no ci] * server : remove rebase artifact --------- Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>	2024-12-18 13:01:41 +02:00
Georgi Gerganov	0e70ba686e	server : add "tokens" output (#10853) * server : add "tokens" output ggml-ci * server : update readme ggml-ci * server : return tokens ids only if requested ggml-ci * tests : improve "tokens" type check Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com> * server : remove "tokens" from the OAI endpoint ggml-ci --------- Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>	2024-12-18 11:05:29 +02:00
Xuan Son Nguyen	46828872c3	server : (embeddings) using same format for "input" and "content" (#10872) * server : (embeddings) using same format for "input" and "content" * fix test case * handle empty input case * fix test	2024-12-18 10:55:09 +02:00
redbeard	6b064c92b4	docs: Fix HIP (née hipBLAS) in README (#10880) Related to #10524 / `be0e350c` references to hipBLAS have been removed across the repository. This fixes the link from the repositories `README.md`. Signed-off-by: Brian 'redbeard' Harrington <redbeard@dead-city.org>	2024-12-18 10:35:00 +02:00
Diego Devesa	4da69d1abd	Revert "llama : add Falcon3 support (#10864)" (#10876) This reverts commit `382bc7f2e8`.	2024-12-18 01:36:46 +01:00
DAN™	d62b532c52	Use model->gguf_kv for loading the template instead of using the C API. (#10868) * Bump model_template to 16384 bytes to support larger chat templates. * Use `model->gguf_kv` for efficiency.	2024-12-17 23:24:22 +01:00
Johannes Gäßler	081b29bd2a	tests: add tests for GGUF (#10830)	2024-12-17 19:09:35 +01:00
Georgi Gerganov	5437d4aaf5	sync : ggml	2024-12-17 18:36:02 +02:00
Georgi Gerganov	78f766768d	cmake : fix "amd64" processor string (whisper/2638)	2024-12-17 18:35:49 +02:00
gn64	8dd19a4812	vulkan : fix soft_max.comp division by zero (whisper/2633) This change prevents a division by zero error when p.KY is 0.	2024-12-17 18:35:49 +02:00
Daniel Bevenius	130d0c90bd	ggml : remove return from ggml_gallocr_allocate_node (ggml/1048) This commit removes the return statement from ggml_gallocr_allocate_node function. The motivation behind this change is to make the code more readable and consistent.	2024-12-17 18:35:49 +02:00
Daniel Bevenius	3919da8e33	ggml : add check for grad_accs (ggml/1046) * ggml : add check for grad_accs This commit adds a check for grad_accs in ggml_graph_get_grad and ggml_graph_get_grad_acc functions. This is necessary to avoid segfaults when grad_accs is not initialized. The motivation for this change is that I find it nice to be able to print out a computation graph using ggml_graph_print but this function segfaults when grad_accs is not initialized: ```console (gdb) p g1 $2 = (ggml_cgraph *) 0x7ffff66004b0 (gdb) p *g1 $3 = {size = 2048, n_nodes = 1, n_leafs = 2, nodes = 0x7ffff6600500, grads = 0x0, grad_accs = 0x0, leafs = 0x7ffff6604500, visited_hash_set = {size = 4099, used = 0x7ffff6610518, keys = 0x7ffff6608500}, order = GGML_CGRAPH_EVAL_ORDER_LEFT_TO_RIGHT} (gdb) p ggml_graph_print(g1) === GRAPH === n_nodes = 1 Program received signal SIGSEGV, Segmentation fault. 0x0000555555579775 in ggml_graph_get_grad (cgraph=0x7ffff66004b0,node=0x7ffff6600340) at /ggml/ggml/src/ggml.c:5990 5990 return igrad != GGML_HASHSET_FULL && ggml_bitset_get(cgraph->visited_hash_set.used, igrad) ? cgraph->grads[igrad] : NULL; ``` * squash! ggml : add check for grad_accs Fix the check in ggml_graph_get_grad. The check was incorrectly using cgraph->grad_accs instead of cgraph->grads.	2024-12-17 18:35:48 +02:00
Georgi Gerganov	0006f5a74a	ggml : update ggml_backend_cpu_device_supports_op (#10867) * ggml : fix cpy op for IQ-quants to use reference impl ggml-ci * ggml : disable tests involving i-matrix quantization * ggml : update ggml_backend_cpu_device_supports_op ggml-ci	2024-12-17 18:35:42 +02:00
krystiancha	05c3a444b8	server : fill usage info in embeddings and rerank responses (#10852) * server : fill usage info in embeddings response * server : fill usage info in reranking response	2024-12-17 18:00:24 +02:00
Billel Mokeddem	382bc7f2e8	llama : add Falcon3 support (#10864)	2024-12-17 17:24:56 +02:00
Ruan	4f51968aca	readme : update typos (#10863)	2024-12-17 11:47:20 +02:00
Xuan Son Nguyen	227d7c5a7f	server : (UI) fix missing async generator on safari (#10857) * server : (UI) fix missing async generator on safari * fix	2024-12-17 09:52:09 +01:00
Eve	7b1ec53f56	vulkan: bugfixes for small subgroup size systems + llvmpipe test (#10809) * ensure mul mat shaders work on systems with subgroup size less than 32 more fixes add test * only s_warptile_mmq needs to be run with 32 threads or more	2024-12-17 06:52:55 +01:00
Zhiyuan Li	160bc039c8	rwkv6: add wkv6 support for Vulkan backend (#10829) * rwkv_wkv6 vulkan shader * RWKV_WKV6 Vulkan op tests passed Signed-off-by: Molly Sophia <mollysophia379@gmail.com> * Apply code format changes Signed-off-by: Molly Sophia <mollysophia379@gmail.com> * add [[unroll]] and remove unnecessary conditions * add uma support * fix erros in EditorConfig Checker --------- Signed-off-by: Molly Sophia <mollysophia379@gmail.com> Co-authored-by: Molly Sophia <mollysophia379@gmail.com>	2024-12-16 22:00:46 +01:00

4386 commits