ochafik
d274ffcc95
build: Add missing optional include for gcc
2025-01-28 09:29:31 +00:00
ochafik
0a51e514f6
Update test-chat-handler.cpp
2025-01-28 09:24:35 +00:00
Olivier Chafik
2f99236f77
Tool-call: do last partial parse upon limit stop
2025-01-28 09:23:19 +00:00
Olivier Chafik
6d5682909f
Clean up dead code in llama_3_1 tool call code
2025-01-28 09:22:26 +00:00
Olivier Chafik
62717145f7
Allow tool use + streaming
2025-01-28 09:22:03 +00:00
Michael Engel
2b8525d5c8
Handle missing model in CLI parameters for llama-run (#11399)
The HTTP client in llama-run only prints an error when the download of
a resource fails. If the model name is missing from the CLI parameter list,
this causes the application to crash.
To prevent this, a check for the required model parameter has been
added, and errors from resource downloads are now propagated to the caller.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-01-28 08:32:40 +00:00
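A minimal sketch of the pattern this entry describes, using hypothetical helper names (Options, parse_args, download_resource) rather than llama-run's actual internals: the required model parameter is validated up front, and download failures are reported through return values so the caller can propagate them instead of crashing.

```cpp
#include <cstdio>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for llama-run internals (not the real code):
// a parsed-options struct and a downloader that reports failure through
// its return value instead of only printing an error.
struct Options {
    std::string model; // required
};

static bool download_resource(const std::string & url, std::string & out_path) {
    // stub: a real implementation would perform the HTTP download and set
    // out_path, returning false if anything goes wrong
    (void) url; (void) out_path;
    return false;
}

static std::optional<Options> parse_args(const std::vector<std::string> & args) {
    Options opts;
    for (size_t i = 0; i < args.size(); i++) {
        if (args[i] == "--model" && i + 1 < args.size()) {
            opts.model = args[++i];
        }
    }
    if (opts.model.empty()) {
        fprintf(stderr, "error: the required model parameter is missing\n");
        return std::nullopt; // reject early instead of crashing later on
    }
    return opts;
}

int main(int argc, char ** argv) {
    const std::vector<std::string> args(argv + 1, argv + argc);
    const auto opts = parse_args(args);
    if (!opts) {
        return 1;
    }
    std::string path;
    if (!download_resource(opts->model, path)) {
        fprintf(stderr, "error: failed to download '%s'\n", opts->model.c_str());
        return 1; // propagate the failure to the caller via the exit code
    }
    return 0;
}
```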
ochafik
ef9efc9ed3
Fix Llama 3.1 (incl. constrained builtin tools e.g. <|python_tag|>foo.call(arg=value))
2025-01-28 01:04:06 +00:00
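The commit title above shows the builtin-tool syntax used by Llama 3.1: <|python_tag|> followed by a name.call(...) expression. As a rough illustration only, the sketch below pulls the pieces out of such a string with a regex; the tool name, argument, and the regex itself are illustrative assumptions, not llama.cpp's actual parsing code.

```cpp
#include <cstdio>
#include <regex>
#include <string>

// Rough illustration only: pull the tool name and a single keyword argument
// out of a Llama-3.1-style builtin tool call such as
//   <|python_tag|>brave_search.call(query="llama.cpp")
// The real handling in llama.cpp is grammar-constrained and more general.
int main() {
    const std::string output = "<|python_tag|>brave_search.call(query=\"llama.cpp\")";

    static const std::regex re(R"re(<\|python_tag\|>(\w+)\.call\((\w+)="([^"]*)"\))re");
    std::smatch m;
    if (std::regex_match(output, m, re)) {
        printf("tool: %s, %s = \"%s\"\n",
               m[1].str().c_str(), m[2].str().c_str(), m[3].str().c_str());
    }
    return 0;
}
```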
ochafik
2d607f1a68
Update test-chat-handler.cpp
2025-01-27 23:29:28 +00:00
ochafik
b565ab2ab1
comment out broken tests in test_tool_call.py
2025-01-27 23:02:15 +00:00
ochafik
cafea60922
Split e2e test_tool_call from test_chat_completion
2025-01-27 22:46:33 +00:00
ochafik
90effb845f
Pass grammar laziness all the way down to sampler (need to print special trigger tokens e.g. for Nemo even w/ tool_choice=required)
2025-01-27 22:46:17 +00:00
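"Grammar laziness" here means the grammar does not constrain sampling until a trigger token has been produced, so the trigger (e.g. Mistral Nemo's [TOOL_CALLS]) can still be generated and printed even with tool_choice=required. A conceptual sketch, with illustrative names rather than llama.cpp's sampler API:

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Conceptual sketch of a "lazy" grammar: it is only activated once a trigger
// word has been emitted, so the trigger itself is sampled and printed
// unconstrained. Names are illustrative, not the actual llama.cpp API.
struct lazy_grammar {
    bool lazy      = true;   // do not constrain until triggered
    bool triggered = false;
    std::vector<std::string> trigger_words = { "[TOOL_CALLS]" };

    bool constrains(const std::string & piece) {
        if (!lazy || triggered) {
            return true;            // grammar is active: filter candidate tokens
        }
        for (const auto & w : trigger_words) {
            if (piece.find(w) != std::string::npos) {
                triggered = true;   // from now on, apply the grammar
            }
        }
        return false;               // before the trigger: sample freely
    }
};

int main() {
    lazy_grammar g;
    for (const std::string piece : { "Sure, ", "[TOOL_CALLS]", "{\"name\":" }) {
        printf("%-14s -> grammar %s\n", piece.c_str(), g.constrains(piece) ? "active" : "inactive");
    }
    return 0;
}
```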
ochafik
ad229783c5
updated tool call example to be less ambiguous (deepseek likes to rant about hello world)
2025-01-27 22:44:44 +00:00
ochafik
fa065eb095
Rehabilitate test_format_detection
2025-01-27 20:46:03 +00:00
ochafik
add9124115
fix test-chat-handler grammar tests
2025-01-27 20:13:09 +00:00
Eric Curtin
a4417ddda9
Add new hf protocol for ollama (#11449)
https://huggingface.co/docs/hub/en/ollama
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-27 19:36:10 +01:00
ochafik
118f799ae4
DeepSeek-R1: implement grammar constraints
2025-01-27 17:52:46 +00:00
ochafik
92ac336dfa
Prepare DeepSeek-R1-Distill-Llama-8B support
2025-01-27 17:26:43 +00:00
ochafik
09971e626c
Update test_chat_completion.py
2025-01-27 15:43:03 +00:00
ochafik
67709552ad
tool-call: compact json output to cap # tokens generated
2025-01-27 15:42:27 +00:00
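The point of the compact output is that serialized tool-call JSON with no indentation or newlines costs fewer generated tokens for the same payload. A small sketch using nlohmann::json, which llama.cpp's server code uses; the example object is arbitrary:

```cpp
#include <nlohmann/json.hpp>
#include <cstdio>

// Compact vs. pretty-printed serialization: the compact form contains no
// indentation or newlines, so fewer characters (and thus fewer tokens) are
// generated for the same payload.
int main() {
    nlohmann::json args = {
        {"location", "Istanbul"},
        {"unit", "celsius"},
    };

    const std::string compact = args.dump();   // {"location":"Istanbul","unit":"celsius"}
    const std::string pretty  = args.dump(2);  // multi-line, indented by 2 spaces

    printf("compact: %zu chars\npretty : %zu chars\n", compact.size(), pretty.size());
    return 0;
}
```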
ochafik
57f40e366b
tool-call: fix lazy grammar & mixed content + tool calls parsing
2025-01-27 15:41:54 +00:00
ochafik
2efa0c27bf
tool-call: add weather tool e2e tests
2025-01-27 15:02:09 +00:00
ochafik
15ec01e896
jinja: only add special tokens if template doesn't seem to handle them
2025-01-27 14:28:11 +00:00
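A sketch of the heuristic this commit title describes, assuming a simple substring check (the actual llama.cpp logic may differ): if the Jinja chat template already emits the BOS token itself, the caller should not add it again, otherwise the prompt would start with a duplicated BOS.

```cpp
#include <cstdio>
#include <string>

// Illustrative heuristic: does the chat template appear to handle the BOS
// token itself? If so, do not add it a second time when tokenizing.
static bool template_handles_bos(const std::string & tmpl, const std::string & bos_token) {
    return !bos_token.empty() && tmpl.find(bos_token) != std::string::npos;
}

int main() {
    const std::string bos  = "<|begin_of_text|>";
    const std::string tmpl = "{{- '<|begin_of_text|>' }}{%- for message in messages %}...{%- endfor %}";

    const bool add_bos = !template_handles_bos(tmpl, bos);
    printf("add_special(bos) = %s\n", add_bos ? "true" : "false"); // false: the template adds it
    return 0;
}
```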
ochafik
da606d8d41
tool-call: remove nonsensical code_interpreter code
2025-01-27 14:19:20 +00:00
Haus1
d6d24cd9ed
AMD: parse the architecture as supplied by gcnArchName (#11244)
The value provided by minor doesn't include the stepping for AMD; parse the value returned by gcnArchName instead to retrieve an accurate ID.
2025-01-27 14:58:17 +01:00
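A rough sketch of what parsing gcnArchName can look like, assuming strings of the form "gfx90a:sramecc+:xnack-"; the exact parsing in the HIP backend may differ. It illustrates why the full string is needed: the minor device property alone drops the stepping.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Rough sketch of deriving an architecture ID from gcnArchName (as returned
// in hipDeviceProp_t), e.g. "gfx90a:sramecc+:xnack-" -> 0x90a.
static int parse_arch_id(const std::string & gcn_arch_name) {
    const std::string prefix = "gfx";
    if (gcn_arch_name.rfind(prefix, 0) != 0) {
        return -1; // unexpected format
    }
    // the ID runs from after "gfx" up to the first ':' (feature flags follow)
    const size_t end = gcn_arch_name.find(':');
    const std::string id = gcn_arch_name.substr(prefix.size(),
        end == std::string::npos ? std::string::npos : end - prefix.size());
    return (int) strtol(id.c_str(), nullptr, 16); // "90a" -> 0x90a
}

int main() {
    printf("0x%x\n", parse_arch_id("gfx90a:sramecc+:xnack-")); // 0x90a
    printf("0x%x\n", parse_arch_id("gfx906:sramecc+:xnack-")); // 0x906
    return 0;
}
```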
lexasub
a5203b4465
llama : minor fixes to speed up llama model loading (#11448)
* impl::load: change the bpe_ranks map to an unordered map, reducing impl::load time by 30%
* llama_model_loader::init_mapping: replace new llama_mmap with std::make_unique<llama_mmap> for cleaner code and to roughly halve the time spent in init_mappings (see the sketch after this entry)
* Update src/llama-vocab.cpp
---------
Co-authored-by: lexasub <empty@empty.ru>
Co-authored-by: Diego Devesa <slarengh@gmail.com>
2025-01-27 14:42:09 +01:00
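A condensed sketch of the two changes listed in the entry above, using simplified stand-in types rather than the real llama.cpp structures: an unordered map (with a hash for the pair key) makes BPE-rank lookups O(1) on average instead of O(log n), and std::make_unique replaces a bare new for the mmap object.

```cpp
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Simplified stand-ins; the real llama.cpp types are more involved.
struct llama_mmap_stub { /* ... */ };

struct pair_hash {
    size_t operator()(const std::pair<std::string, std::string> & p) const {
        return std::hash<std::string>{}(p.first) ^ (std::hash<std::string>{}(p.second) << 1);
    }
};

int main() {
    // bpe_ranks: pair of merge parts -> rank; an unordered_map makes the
    // per-merge lookups during vocab load O(1) on average
    std::unordered_map<std::pair<std::string, std::string>, int, pair_hash> bpe_ranks;
    bpe_ranks[{"h", "e"}]   = 0;
    bpe_ranks[{"he", "llo"}] = 1;

    // init_mapping: prefer make_unique over a raw new
    std::unique_ptr<llama_mmap_stub> mapping = std::make_unique<llama_mmap_stub>();
    (void) mapping;

    return bpe_ranks.at({"h", "e"});
}
```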
ochafik
bddc1bebcc
tool-call: fix special handling of special trigger tokens (Nemo)
2025-01-27 11:37:41 +00:00
Johannes Gäßler
df984e0147
llama: refactor llama_decode_impl (#11381)
2025-01-27 12:07:12 +01:00
Ihar Hrachyshka
acd38efee3
metal: Handle null returned from MTLCreateSystemDefaultDevice() (#11441)
This fixes a segmentation fault when running the tests and no Metal
devices are available (for example, when not linked against the Core
Graphics framework).
2025-01-27 09:41:59 +02:00
ochafik
ca0c837b6a
nits
2025-01-27 01:08:29 +00:00
ochafik
f7078cab36
tool-call: fix functionary v3.1 required test
2025-01-26 23:23:09 +00:00
Xuan Son Nguyen
caf773f249
docker : fix ARM build and Vulkan build (#11434)
* ci : do not fail-fast for docker
* build arm64/amd64 separately
* fix pip
* no fast fail
* vulkan: try jammy
2025-01-26 22:45:32 +01:00
ochafik
5ec4c5e4d3
reshuffle chat handlers
2025-01-26 21:38:07 +00:00
ochafik
43385b2ff2
sync: minja
2025-01-26 21:36:25 +00:00
Georgi Gerganov
178a7eb952
metal : use residency sets (#11427)
* metal : use residency sets
ggml-ci
* metal : restore commandBufferWithUnretainedReferences calls [no ci]
* metal : release descriptors
ggml-ci
* metal : check env GGML_METAL_NO_RESIDENCY
ggml-ci
* metal : fix build + clean-up
ggml-ci
2025-01-26 20:06:16 +02:00
Nuno
6f53d8a6b4
docker: add missing vulkan library to base layer and update to 24.04 (#11422)
Signed-off-by: rare-magma <rare-magma@posteo.eu>
2025-01-26 18:22:43 +01:00
bandoti
19f65187cb
cmake: add ggml find package (#11369)
* Add initial ggml cmake package
* Add build numbers to ggml find-package
* Expand variables with GGML_ prefix
* Guard against adding to cache variable twice
* Add git to msys2 workflow
* Handle ggml-cpu-* variants
* Link ggml/ggml-base libraries to their targets
* Replace main-cmake-pkg with simple-cmake-pkg
* Interface features require c_std_90
* Fix typo
* Removed unnecessary bracket from status message
* Update examples/simple-cmake-pkg/README.md
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Update examples/simple-cmake-pkg/README.md
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-01-26 12:07:48 -04:00
ochafik
11594557e3
Merge branch 'tool-call' into tool-call-handler
2025-01-26 15:32:53 +00:00
ochafik
3f3fc03983
nit: trailing spaces
2025-01-26 15:32:13 +00:00
Frank Mai
1d8ee06000
rpc: fix register position (#11424)
Signed-off-by: thxCode <thxcode0824@gmail.com>
2025-01-26 16:20:34 +01:00
Georgi Gerganov
2cc9b8c32c
readme : update hot topics
2025-01-26 14:30:15 +02:00
Jeff Bolz
f35726c2fb
build: apply MSVC /bigobj option to c/cpp files only (#11423)
2025-01-26 03:10:03 +01:00
Jeff Bolz
4a75d19376
vulkan: compile shaders on-demand (#11406)
Reduce first-run startup time and memory consumption.
Should fix #11339.
2025-01-25 22:29:57 +01:00
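A generic sketch of the on-demand idea behind this change (not the actual Vulkan backend code): pipelines are compiled and cached the first time they are used instead of all at startup, which is where the first-run time and memory savings come from. Names and the integer "handle" are stand-ins.

```cpp
#include <cstdio>
#include <string>
#include <unordered_map>

// Generic lazy-compilation cache; "compile" here stands in for building a
// Vulkan pipeline from a shader. Nothing is compiled at startup; each
// pipeline is built on first use and then reused.
struct pipeline_cache {
    std::unordered_map<std::string, int> pipelines; // name -> handle (stub)

    int get(const std::string & name) {
        auto it = pipelines.find(name);
        if (it != pipelines.end()) {
            return it->second;                       // already compiled
        }
        printf("compiling %s on first use\n", name.c_str());
        const int handle = (int) pipelines.size();   // stand-in for real pipeline creation
        pipelines[name] = handle;
        return handle;
    }
};

int main() {
    pipeline_cache cache;
    cache.get("matmul_f16");  // compiled here, not at startup
    cache.get("matmul_f16");  // cache hit
    cache.get("softmax_f32"); // compiled only because it is actually used
    return 0;
}
```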
uvos
26771a1491
Hip: disable VMM on HIP as it seems it doesn't work in some configurations (#11420)
2025-01-25 21:01:12 +01:00
Jeff Bolz
ca6baf76c1
build: add /bigobj to MSVC build (#11407)
2025-01-25 11:26:37 -06:00
Diego Devesa
6e264a905b
docker : add GGML_CPU_ARM_ARCH arg to select ARM architecture to build for (#11419)
2025-01-25 17:22:41 +01:00
Xuan Son Nguyen
49b0e3cec4
server : fix cleaning up stream task (#11418)
* server : fix cleaning up stream task
* one more spot
2025-01-25 16:36:44 +01:00
Diego Devesa
20a758155b
docker : fix CPU ARM build (#11403)
* docker : fix CPU ARM build
* add CURL to other builds
2025-01-25 15:22:29 +01:00
Georgi Gerganov
00c24acb2a
ci : fix line breaks on windows builds (#11409)
* ci : fix line breaks on windows builds
* cont : another try
* ci : fix powershell line breaks
2025-01-25 13:36:48 +02:00
Olivier Chafik
51b7aab841
Update test_chat_completion.py
2025-01-25 04:57:40 +00:00
Olivier Chafik
a6463c1e35
jinja: don't add bos when jinja enabled
2025-01-25 04:52:42 +00:00