Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								ce28224de8 
								
							 
						 
						
							
							
								
								tool-call: r1: add one more trigger approx "<|tool calls begin|>"  
							
							
							
						 
						
							2025-02-04 00:28:40 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								bff549deb6 
								
							 
						 
						
							
							
								
								simplify hack to fix original template's backfill from minja  
							
							
							
						 
						
							2025-02-04 00:14:48 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								bbd45bf6a2 
								
							 
						 
						
							
							
								
								sync: minja  
							
							
							
						 
						
							2025-02-04 00:14:15 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								30ea3591c9 
								
							 
						 
						
							
							
								
								update to minja's new api  
							
							
							
						 
						
							2025-02-03 23:53:27 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								11c1f0c7d4 
								
							 
						 
						
							
							
								
								actually we want eos_token in the template to infer tool call examples, explicitly skipped in new template options  
							
							
							
						 
						
							2025-02-03 23:52:28 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								108da907f0 
								
							 
						 
						
							
							
								
								sync: minja  https://github.com/google/minja/pull/46  
							
							
							
						 
						
							2025-02-03 23:31:49 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								1c302e18ba 
								
							 
						 
						
							
							
								
								simpler hacky fixes for original broken template (+ fix minja example syntax polyfill)  
							
							
							
						 
						
							2025-02-03 20:34:44 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								c6214ee9d6 
								
							 
						 
						
							
							
								
								rm unneeded vocab  
							
							
							
						 
						
							2025-02-03 19:59:50 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								7dc271fb37 
								
							 
						 
						
							
							
								
								tool-calls: add deepseek r1 template + accommodate broken official template slightly better  
							
							
							
						 
						
							2025-02-03 19:59:33 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								0be7f652e9 
								
							 
						 
						
							
							
								
								Merge branch 'jinja-chatml' into r1-toolcall  
							
							
							
						 
						
							2025-02-03 19:35:54 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								d73448de1c 
								
							 
						 
						
							
							
								
								Simplify default chatml logic  
							
							
							
						 
						
							2025-02-03 19:22:53 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								569610ee77 
								
							 
						 
						
							
							
								
								tool-calls: accommodate variety of wrong tool call opening tags both Qwen 32B and 7B distills like to spit out  
							
							
							
						 
						
							2025-02-03 18:57:55 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								c397bd1f5f 
								
							 
						 
						
							
							
								
								tweak delta logic  
							
							
							
						 
						
							2025-02-03 17:57:38 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								df3474e2c2 
								
							 
						 
						
							
							
								
								tool-calls: r1: add missing <|tool▁calls▁end|> to grammar!  
							
							
							
						 
						
							2025-02-03 17:33:14 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								08271b5505 
								
							 
						 
						
							
							
								
								Merge branch 'jinja-chatml' into r1-toolcall  
							
							
							
						 
						
							2025-02-03 17:32:38 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								b2dd490926 
								
							 
						 
						
							
							
								
								add missing try catch around jinja parsing to default to chatml  
							
							
							
						 
						
							2025-02-03 17:32:12 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								4cb0e1d873 
								
							 
						 
						
							
							
								
								Merge branch 'jinja-chatml' into r1-toolcall  
							
							
							
						 
						
							2025-02-03 17:15:14 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								2b3c4829a3 
								
							 
						 
						
							
							
								
								fix build / rm diff  
							
							
							
						 
						
							2025-02-03 16:34:43 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								aa98e59038 
								
							 
						 
						
							
							
								
								fix bad merge  
							
							
							
						 
						
							2025-02-03 14:01:49 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								5d18d76b69 
								
							 
						 
						
							
							
								
								fix double bos issue (drop bos/eos tokens from jinja template)  
							
							
							
						 
						
							2025-02-03 13:59:16 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								cf83623a47 
								
							 
						 
						
							
							
								
								fix typo  
							
							
							
						 
						
							2025-02-03 13:58:46 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								a76073cf88 
								
							 
						 
						
							
							
								
								minimize diffs  
							
							
							
						 
						
							2025-02-03 10:58:52 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								1e9acd2d31 
								
							 
						 
						
							
							
								
								tool-call: allow --jinja --chat-template chatml  
							
							
							
						 
						
							2025-02-03 04:07:11 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								04be723b33 
								
							 
						 
						
							
							
								
								tool-call: fix command-r7b parsing when response is multiline  
							
							
							
						 
						
							2025-02-03 02:24:30 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								73d08d49cf 
								
							 
						 
						
							
							
								
								tool-call: allow --jinja --chat-template chatml  
							
							
							
						 
						
							2025-02-03 02:24:30 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								c80cb30938 
								
							 
						 
						
							
							
								
								update logs  
							
							
							
						 
						
							2025-02-03 02:24:30 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								04d511b5b5 
								
							 
						 
						
							
							
								
								Avoid double bos w/ jinja  
							
							
							
						 
						
							2025-02-03 02:24:30 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								130ca222c9 
								
							 
						 
						
							
							
								
								DeepSeek R1: parse thoughts / return in separate field in API (non streamed mode)  
							
							
							
						 
						
							2025-02-03 02:24:30 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								87de852b7f 
								
							 
						 
						
							
							
								
								pass vocab to common_chat_params_init  
							
							
							
						 
						
							2025-02-03 02:24:30 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								d3b60b8ad8 
								
							 
						 
						
							
							
								
								minja: enhance backfill of templates w/o tools description (use example tool call delta!)  
							
							
							
						 
						
							2025-02-03 01:03:04 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Eric Curtin 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								84ec8a58f7 
								
							 
						 
						
							
							
								
								Name colors (#11573)
							
							
							
							
							It's more descriptive, use #define's so we can use compile-time
concatenations.
Signed-off-by: Eric Curtin <ecurtin@redhat.com> 
							
						 
						
							2025-02-02 15:14:48 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								bfcce4d693 
								
							 
						 
						
							
							
								
								tool-call: support Command R7B (+ return tool_plan "thoughts" in API) (#11585)
							
							
							
							
							* `tool-call`: support Command R7B (w/ tool_plan return)
* `tool-call`: cleaner preservation of tokens + warn when likely bad chat template override
* `tool-call`: test cleanup / handle lazy grammar triggers 
							
						 
						
							2025-02-02 09:25:38 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								69804487e0 
								
							 
						 
						
							
							
								
								Fix exotic ci env that lacks ostringstream::str (#11581)
							
							
							
						 
						
							2025-02-02 09:10:15 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Michał Moskal 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								ff227703d6 
								
							 
						 
						
							
							
								
								sampling : support for llguidance grammars (#10224)
							
							
							
							
							* initial porting of previous LLG patch
* update for new APIs
* build: integrate llguidance as an external project
* use '%llguidance' as marker to enable llg lark syntax
* add some docs
* clarify docs
* code style fixes
* remove llguidance.h from .gitignore
* fix tests when llg is enabled
* pass vocab not model to llama_sampler_init_llg()
* copy test-grammar-integration.cpp to test-llguidance.cpp
* clang fmt
* fix ref-count bug
* build and run test
* gbnf -> lark syntax
* conditionally include llguidance test based on LLAMA_LLGUIDANCE flag
* rename llguidance test file to test-grammar-llguidance.cpp
* add gh action for llg test
* align tests with LLG grammar syntax and JSON Schema spec
* llama_tokenizer() in fact requires valid utf8
* update llg
* format file
* add $LLGUIDANCE_LOG_LEVEL support
* fix whitespace
* fix warning
* include <cmath> for INFINITY
* add final newline
* fail llama_sampler_init_llg() at runtime
* Link gbnf_to_lark.py script; fix links; refer to llg docs for lexemes
* simplify #includes
* improve doc string for LLAMA_LLGUIDANCE
* typo in merge
* bump llguidance to 0.6.12 
							
						 
						
							2025-02-02 09:55:32 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								cfd74c86db 
								
							 
						 
						
							
							
								
								sync: minja (418a2364b5) (#11574)
							
							
							
						 
						
							2025-02-01 12:24:51 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a83f528688 
								
							 
						 
						
							
							
								
								tool-call: fix llama 3.x and functionary 3.2, play nice w/ pydantic_ai package, update readme (#11539)
							
							
							
							
							* An empty tool_call_id is better than none!
* sync: minja (tool call name optional https://github.com/google/minja/pull/36)
* Force-disable parallel_tool_calls if template doesn't support it
* More debug logs
* Llama 3.x tools: accept / trigger on more varied spaced outputs
* Fix empty content for functionary v3.2 tool call
* Add proper tool call docs to server README
* readme: function calling *is* supported now
* Apply suggestions from code review
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> 
							
						 
						
							2025-01-31 14:15:25 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Steve Grubb 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1bd3047a93 
								
							 
						 
						
							
							
								
								common: Add missing va_end (#11529)
							
							
							
							
							The va_copy man page states that va_end must be called to revert
whatever the copy did. For some implementations, not calling va_end
has no consequences. For others it could leak memory.
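
The pairing rule described in this commit message can be sketched as follows. This is a minimal illustration and not the code changed by the commit: `format_string` is a hypothetical helper introduced only for this example. It measures the output with one traversal of the arguments, writes it with a second traversal over a copy, and releases both lists on every path.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not from the commit): printf-style formatting into a std::string.
static std::string format_string(const char * fmt, ...) {
    va_list args;
    va_list args_copy;
    va_start(args, fmt);
    va_copy(args_copy, args);  // the second traversal needs its own va_list

    // First pass: compute the required length, excluding the terminating NUL.
    const int size = std::vsnprintf(nullptr, 0, fmt, args);
    if (size < 0) {
        // Release both lists on the error path as well.
        va_end(args_copy);
        va_end(args);
        return std::string();
    }

    // Second pass: write into a buffer sized for the text plus the NUL.
    std::vector<char> buf(static_cast<std::size_t>(size) + 1);
    std::vsnprintf(buf.data(), buf.size(), fmt, args_copy);

    va_end(args_copy);  // each va_copy is matched by a va_end
    va_end(args);       // each va_start is matched by a va_end
    return std::string(buf.data(), static_cast<std::size_t>(size));
}
```

On platforms where `va_list` is a plain pointer, omitting the `va_end` calls usually goes unnoticed, which is why this kind of omission tends to survive until someone audits it.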
							
						 
						
							2025-01-31 07:58:55 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8b576b6c55 
								
							 
						 
						
							
							
								
								Tool call support (generic + native for Llama, Functionary, Hermes, Mistral, Firefunction, DeepSeek) w/ lazy grammars (#9639)
							
							
							
							
							---------
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Xuan Son Nguyen <son@huggingface.co> 
							
						 
						
							2025-01-30 19:13:58 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								3d804dec76 
								
							 
						 
						
							
							
								
								sync: minja (#11499)
							
							
							
						 
						
							2025-01-30 10:30:27 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Daniel Bevenius 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b636228c0a 
								
							 
						 
						
							
							
								
								embedding : enable --no-warmup option (#11475)
							
							
							
							
							This commit enables the `--no-warmup` option for the llama-embeddings.
The motivation for this change is to allow the user to disable the
warmup when running the program.
							
						 
						
							2025-01-29 10:38:54 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								c64d2becb1 
								
							 
						 
						
							
							
								
								minja: sync at 0f5f7f2b37 (#11352)
							
							
							
						 
						
							2025-01-22 16:16:27 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								a94f3b2727 
								
							 
						 
						
							
							
								
								common: utils to split / join / repeat strings (from json converter) (#11342)
							
							
							
							
							* Factor string_join, string_split, string_repeat into common
* json: refactor to surface a versatile builder
* Update common.cpp 
							
						 
						
							2025-01-22 09:51:44 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6171c9d258 
								
							 
						 
						
							
							
								
								Add Jinja template support (#11016)
							
							
							
							
							* Copy minja from 58f0ca6dd7 (https://github.com/google/minja/pull/22)
* Apply suggestions from code review
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Finish suggested renamings
* Move chat_templates inside server_context + remove mutex
* Update --chat-template-file w/ recent change to --chat-template
* Refactor chat template validation
* Guard against missing eos/bos tokens (null token otherwise throws in llama_vocab::impl::token_get_attr)
* Warn against missing eos / bos tokens when jinja template references them
* rename: common_chat_template[s]
* reinstate assert on chat_templates.template_default
* Update minja to b8437df626 https://github.com/google/minja/pull/25
* Update minja from https://github.com/google/minja/pull/27 
* rm unused optional header
---------
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> 
							
						 
						
							2025-01-21 13:18:51 +00:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								80d0d6b4b7 
								
							 
						 
						
							
							
								
								common : add -hfd option for the draft model (#11318)
							
							
							
							
							* common : add -hfd option for the draft model
* cont : fix env var
* cont : more fixes 
							
						 
						
							2025-01-20 22:29:43 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									LostRuins Concedo 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6390a998bf 
								
							 
						 
						
							
							
								
								tts : add guide tokens support (#11186)
							
							
							
							
							* Added the ability to use guide tokens for OuteTTS, greatly improving TTS recitation accuracy over long input sequences.
* applied linting suggestions, updated to latest llama_vocab changes, added a safety check, added newline to guide token start 
							
						 
						
							2025-01-18 12:20:57 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Radoslav Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								667d72846c 
								
							 
						 
						
							
							
								
								rpc : early register backend devices (#11262)
							
							
							
							
							Early register RPC devices and do not propagate RPC specifics in the
llama model structures.
ref: #10609  
							
						 
						
							2025-01-17 10:57:09 +02:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Xuan Son Nguyen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								84a44815f7 
								
							 
						 
						
							
							
								
								cli : auto activate conversation mode if chat template is available (#11214)
							
							
							
							
							* cli : auto activate conversation mode if chat template is detected
* add warn on bad template
* update readme (writing with the help of chatgpt)
* update readme (2)
* do not activate -cnv for non-instruct models 
							
						 
						
							2025-01-13 20:18:12 +01:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Xuan Son Nguyen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								00b4c3da62 
								
							 
						 
						
							
							
								
								common : support tag-based --hf-repo like on ollama (#11195)
							
							
							
							
							* common : support tag-based hf_repo like on ollama
* fix build
* various fixes
* small fixes
* fix style
* fix windows build?
* move common_get_hf_file to common.cpp
* fix complaint with noreturn
							
						 
						
							2025-01-13 13:56:23 +01:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Xuan Son Nguyen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								9a483999a6 
								
							 
						 
						
							
							
								
								llama : fix chat template gguf key (#11201)
							
							
							
						 
						
							2025-01-12 13:45:14 +01:00 
							
								 
							
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								afa8a9ec9b 
								
							 
						 
						
							
							
								
								llama : add llama_vocab, functions -> methods, naming (#11110)
							
							
							
							
							* llama : functions -> methods (#11110)
* llama : add struct llama_vocab to the API (#11156)
ggml-ci
* hparams : move vocab params to llama_vocab (#11159)
ggml-ci
* vocab : more pimpl (#11165)
ggml-ci
* vocab : minor tokenization optimizations (#11160)
ggml-ci
Co-authored-by: Diego Devesa <slarengh@gmail.com>
* lora : update API names (#11167)
ggml-ci
* llama : update API names to use correct prefix (#11174)
* llama : update API names to use correct prefix
ggml-ci
* cont
ggml-ci
* cont
ggml-ci
* minor [no ci]
* vocab : llama_vocab_add_[be]os -> llama_vocab_get_add_[be]os (#11174)
ggml-ci
* vocab : llama_vocab_n_vocab -> llama_vocab_n_tokens (#11174)
ggml-ci
---------
Co-authored-by: Diego Devesa <slarengh@gmail.com> 
							
						 
						
							2025-01-12 11:32:42 +02:00