llama.cpp

Author	SHA1	Message	Date
Georgi Gerganov	589b48d41e	contrib : add Resources section (#9675)	2024-09-29 14:38:18 +03:00
ochafik	9ac4b04aa2	`tool-call`: add fs_list_files to common, w/ win32 impl for msys2 build	2024-09-29 00:42:52 +01:00
ochafik	cb7912ee74	`chat-template`: add phi-3.5-vision-instruct	2024-09-29 00:33:19 +01:00
ochafik	8738d94bbd	`minja`: qualify std::nullptr_t type for msys2 build	2024-09-29 00:18:22 +01:00
ochafik	c87c12168a	`tool-call`: fix memory leak in test	2024-09-28 23:44:28 +01:00
ochafik	22493c8e9e	`tests`: fix test-chat-template run from build	2024-09-28 23:31:23 +01:00
ochafik	ad6719e2a7	`tests`: fix typo	2024-09-28 23:26:19 +01:00
ochafik	a072f30a8d	`tests`: attempt to find assets for tests run from build subfolder	2024-09-28 23:15:36 +01:00
ochafik	bc3e0c0830	`tool-call`: Qwen 2.5 Instruct also requires object arguments	2024-09-28 23:05:35 +01:00
ochafik	b10ef04d8d	`chat-template`: tweak --chat-template error message when --jinja is set	2024-09-28 22:36:38 +01:00
ochafik	dbda025f87	`tool-call`: test messages -> template -> grammar -> tool call parser	2024-09-28 22:32:47 +01:00
ochafik	0ae1112faa	`agent`: try to fix pyright lint	2024-09-28 20:10:08 +01:00
ochafik	1b32ac129f	`chat-template`: fix test-arg	2024-09-28 20:06:10 +01:00
ochafik	9358d1f62c	`minja`: fix gcc8 build of test	2024-09-28 19:50:08 +01:00
ochafik	e6be59c2a0	`antiprompts`: fix gcc8 build (avoid recursive struct)	2024-09-28 19:39:52 +01:00
ochafik	ef2a020276	`tool-call`: make agent async	2024-09-28 19:11:09 +01:00
ochafik	05bbba9f8a	`tool-call`: only match json eagerly for Llama 3.2	2024-09-28 19:05:10 +01:00
ochafik	6e0053a81b	`chat-template`: enumerate files w/ C API rather than private using std::__fs::filesystem	2024-09-28 18:47:11 +01:00
ochafik	c657857e21	`tool-call`: cleanup tools.py	2024-09-28 18:33:40 +01:00
ochafik	55cf337560	`tool-call`: better error reporting for server tests	2024-09-28 18:33:40 +01:00
ochafik	7cef90cf9c	`tool-call`: more eager function call parsing for Functionary & Llama (give a chance to 3B model)	2024-09-28 18:33:40 +01:00
ochafik	8b2cf3509f	`tool-call`: fix grammar trigger crash	2024-09-28 18:30:01 +01:00
ochafik	d983516f40	`tool-call`: let the tool call handler expand chat template, moving builtin_tools down as extra_context	2024-09-28 17:46:36 +01:00
ochafik	0c85bc7a8f	`tool-call`: test tool call style detection	2024-09-28 17:43:09 +01:00
Georgi Gerganov	f4d2b8846a	llama : add reranking support (#9510) * py : add XLMRobertaForSequenceClassification [no ci] * py : fix scalar-tensor conversion [no ci] * py : fix position embeddings chop [no ci] * llama : read new cls tensors [no ci] * llama : add classigication head (wip) [no ci] * llama : add "rank" pooling type ggml-ci * server : add rerank endpoint ggml-ci * llama : aboud ggml_repeat during classification * rerank : cleanup + comments * server : accept /rerank endpoint in addition to /v1/rerank [no ci] * embedding : parse special tokens * jina : support v1 reranker * vocab : minor style ggml-ci * server : initiate tests for later ggml-ci * server : add docs * llama : add comment [no ci] * llama : fix uninitialized tensors * ci : add rerank tests ggml-ci * add reranking test * change test data * Update examples/server/server.cpp Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com> * add `--reranking` argument * update server docs * llama : fix comment [no ci] ggml-ci --------- Co-authored-by: Xuan Son Nguyen <son@huggingface.co> Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>	2024-09-28 17:42:03 +03:00
slaren	1b2f992cd2	test-backend-ops : use flops for some performance tests (#9657) * test-backend-ops : use flops for some performance tests - parallelize tensor quantization - use a different set of cases for performance and correctness tests - run each test for at least one second	2024-09-28 14:32:46 +02:00
Georgi Gerganov	739842703e	llama : add comment about thread-safety [no ci] (#9449)	2024-09-28 15:13:42 +03:00
Zhenwei Jin	6102037bbb	vocab : refactor tokenizer to reduce init overhead (#9449) * refactor tokenizer * llama : make llm_tokenizer more private ggml-ci * refactor tokenizer * refactor tokenizer * llama : make llm_tokenizer more private ggml-ci * remove unused files * remove unused fileds to avoid unused filed build error * avoid symbol link error * Update src/llama.cpp * Update src/llama.cpp --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-09-28 15:10:58 +03:00
nopperl	9a913110cf	llama : add support for Chameleon (#8543) * convert chameleon hf to gguf * add chameleon tokenizer tests * fix lint * implement chameleon graph * add swin norm param * return qk norm weights and biases to original format * implement swin norm * suppress image token output * rem tabs * add comment to conversion * fix ci * check for k norm separately * adapt to new lora implementation * fix layer input for swin norm * move swin_norm in gguf writer * add comment regarding special token regex in chameleon pre-tokenizer * Update src/llama.cpp Co-authored-by: compilade <git@compilade.net> * fix punctuation regex in chameleon pre-tokenizer (@compilade) Co-authored-by: compilade <git@compilade.net> * fix lint * trigger ci --------- Co-authored-by: compilade <git@compilade.net>	2024-09-28 15:08:43 +03:00
Aarni Koskela	43bcdd9703	readme : add tool (#9655)	2024-09-28 15:07:14 +03:00
Dan Johansson	6a0f779484	ggml : add run-time detection of neon, i8mm and sve (#9331) * ggml: Added run-time detection of neon, i8mm and sve Adds run-time detection of the Arm instructions set features neon, i8mm and sve for Linux and Apple build targets. * ggml: Extend feature detection to include non aarch64 Arm arch * ggml: Move definition of ggml_arm_arch_features to the global data section	2024-09-28 15:06:16 +03:00
Markus Tavenrath	89f9944981	Enable use to the rebar feature to upload buffers to the device. (#9251)	2024-09-28 12:05:05 +02:00
ochafik	887951beb0	`minja`: generate chat goldens w/ fixed date to support Llama-3.2-3B-Instruct (uses strftime_now)	2024-09-27 19:52:15 +01:00
ochafik	701b664551	`minja`: add `indent` filter to support command-r-plus's chat templates	2024-09-27 19:00:14 +01:00
Georgi Gerganov	b5de3b74a5	readme : update hot topics	2024-09-27 20:57:51 +03:00
ochafik	0093a5e527	`minja`: fix identifiers parsing (when start w/ not/is/etc) and lstrip_blocks corner case (needed by DeepSeek-V2.5	2024-09-27 18:30:44 +01:00
Borislav Stanimirov	44f59b4301	cmake : add option for common library (#9661)	2024-09-27 10:42:06 +03:00
ochafik	2f25ee30ef	Update README.md	2024-09-27 07:18:07 +01:00
ochafik	86e4f99092	Update README.md	2024-09-27 07:15:25 +01:00
ochafik	e62b5de3cf	`tool-call`: fix functionary-small-3.2 (first tool starts w/ name\n, subsequent are >>>name\n)	2024-09-27 07:06:33 +01:00
ochafik	e33b342da7	`tool-call`: fix passing of tools to template + allow agent to finish	2024-09-27 06:24:22 +01:00
ochafik	f62e688387	`tool-call`: fix crash / test non-tool call case (added llama_sampler_is_grammar_empty)	2024-09-27 06:04:41 +01:00
ochafik	0abfa36ca7	`tool-call`: move usage examples to examples/agent	2024-09-27 05:10:30 +01:00
ochafik	6610ecf965	`server`: rm bad debug code	2024-09-27 04:07:35 +01:00
ochafik	27cd07a056	`json`: fix grammar conversion typo	2024-09-27 03:57:48 +01:00
ochafik	9295ca95db	`tool-call`: fix agent type lints	2024-09-27 03:53:56 +01:00
ochafik	1e5c0e747e	`chat-template`: fix jinja tests (make safe a passthrough)	2024-09-27 03:50:04 +01:00
ochafik	f9c1743bb5	`minja`: fix iterables	2024-09-27 03:36:49 +01:00
ochafik	8299fac07c	`tool-call`: adapt very simple agent + docker isolation from https://github.com/ggerganov/llama.cpp/pull/6389	2024-09-26 21:07:46 +01:00
ochafik	10f9fe8d49	`tool-call`: fix tool call return format	2024-09-26 21:01:04 +01:00

4066 commits