llama.cpp

Author	SHA1	Message	Date
Georgi Gerganov	d291c74253	llama : handle no-vocab detokenization	2024-12-18 14:02:26 +02:00
Georgi Gerganov	824fa750d4	llama : update WavTokenizer to non-causal attn	2024-12-18 14:02:26 +02:00
Georgi Gerganov	2033fb7eef	cont [no ci]	2024-12-18 14:02:26 +02:00
Georgi Gerganov	35259e5335	cont ggml-ci	2024-12-18 14:02:26 +02:00
Georgi Gerganov	980d631032	llama : refactor wavtokenizer tensors ggml-ci	2024-12-18 14:02:26 +02:00
Georgi Gerganov	d1ef627c51	tts : fix tensor shapes	2024-12-18 14:02:26 +02:00
Georgi Gerganov	c096bbd8dd	tts : remove hardcoded constants ggml-ci	2024-12-18 14:02:25 +02:00
Georgi Gerganov	e70f140c04	tts : outetts-voc -> wavtokenizer-dec	2024-12-18 14:02:25 +02:00
Georgi Gerganov	befdcd2492	tts : text pre-processing	2024-12-18 14:02:25 +02:00
Georgi Gerganov	3d54be4d84	tts : update default samplers ggml-ci	2024-12-18 14:02:25 +02:00
Georgi Gerganov	1d7c27ca93	tts : fixes	2024-12-18 14:02:25 +02:00
Georgi Gerganov	906a0edb5a	tts : fix sampling + cut initial noise	2024-12-18 14:02:24 +02:00
Georgi Gerganov	2221e54278	tts : add matchematical constant ggml-ci	2024-12-18 14:02:24 +02:00
Georgi Gerganov	d4fa34bdd4	tts : add header + minor fixes ggml-ci	2024-12-18 14:02:24 +02:00
Georgi Gerganov	8329e850cc	tts : minor fix	2024-12-18 14:02:24 +02:00
Georgi Gerganov	db613915de	clip : fix new conv name	2024-12-18 14:02:24 +02:00
Georgi Gerganov	b9a011e123	tts : receive input text and generate codes	2024-12-18 14:02:24 +02:00
Georgi Gerganov	191da330fc	clean-up	2024-12-18 14:02:23 +02:00
Georgi Gerganov	e52797162e	spectrum processing	2024-12-18 14:02:23 +02:00
Georgi Gerganov	5a1c98e8d2	fft	2024-12-18 14:02:23 +02:00
Georgi Gerganov	e728cfd297	compute hann window	2024-12-18 14:02:23 +02:00
Georgi Gerganov	a1f08ad338	fix n_embd + remove llama.cpp hacks	2024-12-18 14:02:23 +02:00
Georgi Gerganov	eb1b70f42a	hann window	2024-12-18 14:02:23 +02:00
Georgi Gerganov	839035d1bb	head	2024-12-18 14:02:22 +02:00
Georgi Gerganov	fe6dd5aa61	convnext	2024-12-18 14:02:22 +02:00
Georgi Gerganov	b3ba05e5bc	layer norm	2024-12-18 14:02:22 +02:00
Georgi Gerganov	435cfd788b	pos net	2024-12-18 14:02:22 +02:00
Georgi Gerganov	3046fde420	attn	2024-12-18 14:02:22 +02:00
Georgi Gerganov	13dd8941a4	resnet	2024-12-18 14:02:22 +02:00
Georgi Gerganov	3d08d62b6c	resnet conv	2024-12-18 14:02:21 +02:00
Georgi Gerganov	5296c96ca8	group norm	2024-12-18 14:02:21 +02:00
Georgi Gerganov	6ef14091c0	first conv	2024-12-18 14:02:21 +02:00
Georgi Gerganov	aac7e04953	extract features	2024-12-18 14:02:21 +02:00
Georgi Gerganov	ff2ea75fb4	wip	2024-12-18 14:02:21 +02:00
Georgi Gerganov	f169965158	llama : add OuteTTS support (wip)	2024-12-18 14:02:20 +02:00
Georgi Gerganov	e65556f174	server : do not normalize embeddings when there is no pooling ggml-ci	2024-12-18 14:02:05 +02:00
Georgi Gerganov	1b18b2d7b0	server : be explicit about the pooling type in the tests ggml-ci	2024-12-18 14:01:22 +02:00
Georgi Gerganov	06e85401b0	server : output embeddings for all tokens when pooling = none ggml-ci	2024-12-18 14:00:50 +02:00
Georgi Gerganov	89eaf5036a	server : add "tokens" output ggml-ci	2024-12-18 13:59:47 +02:00
Georgi Gerganov	152610eda9	server : output embeddings for all tokens when pooling = none (#10861) * server : add "tokens" output ggml-ci * server : output embeddings for all tokens when pooling = none ggml-ci * server : update readme [no ci] * server : fix spacing [no ci] Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com> * server : be explicit about the pooling type in the tests ggml-ci * server : update /embeddings and /v1/embeddings endpoints ggml-ci * server : do not normalize embeddings when there is no pooling ggml-ci * server : update readme ggml-ci * server : fixes * tests : update server tests ggml-ci * server : update readme [no ci] * server : remove rebase artifact --------- Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>	2024-12-18 13:01:41 +02:00
Georgi Gerganov	0e70ba686e	server : add "tokens" output (#10853) * server : add "tokens" output ggml-ci * server : update readme ggml-ci * server : return tokens ids only if requested ggml-ci * tests : improve "tokens" type check Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com> * server : remove "tokens" from the OAI endpoint ggml-ci --------- Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>	2024-12-18 11:05:29 +02:00
Xuan Son Nguyen	46828872c3	server : (embeddings) using same format for "input" and "content" (#10872) * server : (embeddings) using same format for "input" and "content" * fix test case * handle empty input case * fix test	2024-12-18 10:55:09 +02:00
redbeard	6b064c92b4	docs: Fix HIP (née hipBLAS) in README (#10880) Related to #10524 / `be0e350c` references to hipBLAS have been removed across the repository. This fixes the link from the repositories `README.md`. Signed-off-by: Brian 'redbeard' Harrington <redbeard@dead-city.org>	2024-12-18 10:35:00 +02:00
Diego Devesa	4da69d1abd	Revert "llama : add Falcon3 support (#10864)" (#10876) This reverts commit `382bc7f2e8`.	2024-12-18 01:36:46 +01:00
DAN™	d62b532c52	Use model->gguf_kv for loading the template instead of using the C API. (#10868) * Bump model_template to 16384 bytes to support larger chat templates. * Use `model->gguf_kv` for efficiency.	2024-12-17 23:24:22 +01:00
Johannes Gäßler	081b29bd2a	tests: add tests for GGUF (#10830)	2024-12-17 19:09:35 +01:00
Georgi Gerganov	5437d4aaf5	sync : ggml	2024-12-17 18:36:02 +02:00
Georgi Gerganov	78f766768d	cmake : fix "amd64" processor string (whisper/2638)	2024-12-17 18:35:49 +02:00
gn64	8dd19a4812	vulkan : fix soft_max.comp division by zero (whisper/2633) This change prevents a division by zero error when p.KY is 0.	2024-12-17 18:35:49 +02:00
Daniel Bevenius	130d0c90bd	ggml : remove return from ggml_gallocr_allocate_node (ggml/1048) This commit removes the return statement from ggml_gallocr_allocate_node function. The motivation behind this change is to make the code more readable and consistent.	2024-12-17 18:35:49 +02:00

4394 commits