llama.cpp

Author	SHA1	Message	Date
ochafik	30dcfaa57a	rm wrong warning in command-r parser (when normal text)	2025-02-09 18:13:32 +00:00
ochafik	91542ca245	tool-calls: allow r1 output to miss <think> opening tag (since latest template update adds it)	2025-02-09 15:50:21 +00:00
ochafik	95cddfd8fb	rm thoughts from generic parser	2025-02-09 01:27:58 +00:00
ochafik	c0f972bb45	Use --reasoning-format, remove forced thinking for now	2025-02-08 17:58:33 +00:00
Olivier Chafik	994301da12	use existing string_strip	2025-02-05 16:33:16 +00:00
Olivier Chafik	e6d9b52480	align Command R7B w/ --think / reasoning_content behaviour	2025-02-05 15:47:37 +00:00
Olivier Chafik	3841a163ef	fix compiler warning about parens	2025-02-05 13:05:27 +00:00
ochafik	f3e9f8b62a	fix test_thoughts	2025-02-05 12:34:27 +00:00
ochafik	9d7c3cc51b	--think to force any model to return reasoning_content (or just parse <think> for deepseek r1)	2025-02-05 12:16:37 +00:00
Olivier Chafik	933f7a186e	Merge branch 'master' into r1-toolcall	2025-02-04 15:56:25 +00:00
Olivier Chafik	db288b60cb	`tool-call`: command r7b fix for normal responses (#11608) * fix command r7b normal response regex + add to server test * test multiline non-tool-call responses in test-chat	2025-02-04 15:48:53 +00:00
Olivier Chafik	39c1d8163b	return thoughts in reasoning_content field	2025-02-04 11:37:09 +00:00
ochafik	d1b66910c5	r1: revert making <｜tool▁calls▁begin｜> optional as somehow sampling triggers us on "<｜tool▁call▁begin｜><", which is already invalid per the grammar	2025-02-04 10:38:03 +00:00
ochafik	0db9881285	Fix r1 grammar since we made <｜tool▁calls▁begin｜> optional (triggering on just <｜tool▁call▁begin｜> for 7B's sake)	2025-02-04 10:30:10 +00:00
ochafik	b5b117fa1c	Merge branch 'sync-minja-4' into r1-toolcall	2025-02-04 09:45:27 +00:00
ochafik	21f207156f	Update chat.cpp	2025-02-04 05:16:23 +00:00
ochafik	438ce0b8a1	fix test-chat	2025-02-04 04:51:36 +00:00
ochafik	1f5ec59809	ensure deepseek r1 thoughts parsed even w/o tool calls	2025-02-04 04:48:08 +00:00
ochafik	d44eb95c67	tool-call: ensure we don't return content when there are tool calls / warn	2025-02-04 04:18:49 +00:00
ochafik	d43e4f6c22	Merge branch 'sync-minja-4' into r1-toolcall	2025-02-04 04:05:02 +00:00
ochafik	f12e3507f7	Update chat.cpp	2025-02-04 04:02:18 +00:00
ochafik	09caa63451	`sync`: minja `182de30cda`	2025-02-04 03:52:59 +00:00
ochafik	f0154a6479	Fix / test models/templates/llama-cpp-deepseek-r1.jinja	2025-02-04 03:09:15 +00:00
ochafik	a682d1216d	fix / test parsing of r1 parser	2025-02-04 02:23:31 +00:00
ochafik	18a11f43f0	tool-call: r1: fix grammar	2025-02-04 01:12:44 +00:00
ochafik	e84ee88f50	r1: fix inadvertent newline in grammar before <｜tool▁call▁end｜>	2025-02-04 00:36:38 +00:00
Olivier Chafik	ce28224de8	tool-call: r1: add one more trigger approx "<｜tool calls begin｜>"	2025-02-04 00:28:40 +00:00
Olivier Chafik	bff549deb6	simplify hack to fix original template's backfill from minja	2025-02-04 00:14:48 +00:00
Olivier Chafik	30ea3591c9	update to minja's new api	2025-02-03 23:53:27 +00:00
Olivier Chafik	1c302e18ba	simpler hacky fixes for original broken template (+ fix minja example syntax polyfill)	2025-02-03 20:34:44 +00:00
Olivier Chafik	c6214ee9d6	rm unneeded vocab	2025-02-03 19:59:50 +00:00
Olivier Chafik	7dc271fb37	tool-calls: add deepseek r1 template + accommodate broken official template slightly better	2025-02-03 19:59:33 +00:00
Olivier Chafik	569610ee77	tool-calls: accommodate variety of wrong tool call opening tags both Qwen 32B and 7B distills like to spit out	2025-02-03 18:57:55 +00:00
Olivier Chafik	df3474e2c2	tool-calls: r1: add missing <｜tool▁calls▁end｜> to grammar!	2025-02-03 17:33:14 +00:00
ochafik	a76073cf88	minimize diffs	2025-02-03 10:58:52 +00:00
ochafik	04be723b33	tool-call: fix command-r7b parsing when response is multiline	2025-02-03 02:24:30 +00:00
ochafik	c80cb30938	update logs	2025-02-03 02:24:30 +00:00
ochafik	130ca222c9	DeepSeek R1: parse thoughts / return in separate field in API (non streamed mode)	2025-02-03 02:24:30 +00:00
ochafik	87de852b7f	pass vocab to common_chat_params_init	2025-02-03 02:24:30 +00:00
Olivier Chafik	bfcce4d693	`tool-call`: support Command R7B (+ return tool_plan "thoughts" in API) (#11585) * `tool-call`: support Command R7B (w/ tool_plan return) * `tool-call`: cleaner preservation of tokens + warn when likely bad chat template override * `tool-call`: test cleanup / handle lazy grammar triggers	2025-02-02 09:25:38 +00:00
Olivier Chafik	a83f528688	`tool-call`: fix llama 3.x and functionary 3.2, play nice w/ pydantic_ai package, update readme (#11539) * An empty tool_call_id is better than none! * sync: minja (tool call name optional https://github.com/google/minja/pull/36) * Force-disable parallel_tool_calls if template doesn't support it * More debug logs * Llama 3.x tools: accept / trigger on more varied spaced outputs * Fix empty content for functionary v3.2 tool call * Add proper tool call docs to server README * readme: function calling is supported now * Apply suggestions from code review Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2025-01-31 14:15:25 +00:00
Olivier Chafik	8b576b6c55	Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639) --------- Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> Co-authored-by: Xuan Son Nguyen <son@huggingface.co>	2025-01-30 19:13:58 +00:00

42 commits