Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								bff549deb6 
								
							 
						 
						
							
							
								
								simplify hack to fix original template's backfill from minja  
							
							
							
						 
						
							2025-02-04 00:14:48 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								bbd45bf6a2 
								
							 
						 
						
							
							
								
								sync: minja  
							
							
							
						 
						
							2025-02-04 00:14:15 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								30ea3591c9 
								
							 
						 
						
							
							
								
								update to minja's new api  
							
							
							
						 
						
							2025-02-03 23:53:27 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								11c1f0c7d4 
								
							 
						 
						
							
							
								
								actually we want eos_token in the template to infer tool call examples, explicitly skipped in new template options  
							
							
							
						 
						
							2025-02-03 23:52:28 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								bc6d910f6d 
								
							 
						 
						
							
							
								
								Merge branch 'master' into r1-toolcall  
							
							
							
						 
						
							2025-02-03 23:51:31 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								cde3833239 
								
							 
						 
						
							
							
								
								tool-call: allow --chat-template chatml w/ --jinja, default to chatml upon parsing issue, avoid double bos (#11616)
							
							
							
							
							* tool-call: allow `--jinja --chat-template chatml`
* fix double bos issue (drop bos/eos tokens from jinja template)
* add missing try catch around jinja parsing to default to chatml
* Simplify default chatml logic 
							
						 
						
							2025-02-03 23:49:27 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								108da907f0 
								
							 
						 
						
							
							
								
								sync: minja https://github.com/google/minja/pull/46
							
							
							
						 
						
							2025-02-03 23:31:49 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Xuan-Son Nguyen 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								b3451785ac 
								
							 
						 
						
							
							
								
								server : (webui) revert hacky solution from #11626 (#11634)
							
							
							
						 
						
							2025-02-04 00:10:52 +01:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Woof Dog 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								1d1e6a90bc 
								
							 
						 
						
							
							
								
								server : (webui) allow typing and submitting during llm response (#11626)
							
							
							
						 
						
							2025-02-03 23:16:27 +01:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								1c302e18ba 
								
							 
						 
						
							
							
								
								simpler hacky fixes for original broken template (+ fix minja example syntax polyfill)  
							
							
							
						 
						
							2025-02-03 20:34:44 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								c6214ee9d6 
								
							 
						 
						
							
							
								
								rm unneeded vocab  
							
							
							
						 
						
							2025-02-03 19:59:50 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								7dc271fb37 
								
							 
						 
						
							
							
								
								tool-calls: add deepseek r1 template + accommodate broken official template slightly better  
							
							
							
						 
						
							2025-02-03 19:59:33 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								0be7f652e9 
								
							 
						 
						
							
							
								
								Merge branch 'jinja-chatml' into r1-toolcall  
							
							
							
						 
						
							2025-02-03 19:35:54 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								d73448de1c 
								
							 
						 
						
							
							
								
								Simplify default chatml logic  
							
							
							
						 
						
							2025-02-03 19:22:53 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								569610ee77 
								
							 
						 
						
							
							
								
								tool-calls: accommodate variety of wrong tool call opening tags both Qwen 32B and 7B distills like to spit out  
							
							
							
						 
						
							2025-02-03 18:57:55 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								c397bd1f5f 
								
							 
						 
						
							
							
								
								tweak delta logic  
							
							
							
						 
						
							2025-02-03 17:57:38 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								df3474e2c2 
								
							 
						 
						
							
							
								
								tool-calls: r1: add missing <|tool▁calls▁end|> to grammar!  
							
							
							
						 
						
							2025-02-03 17:33:14 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								08271b5505 
								
							 
						 
						
							
							
								
								Merge branch 'jinja-chatml' into r1-toolcall  
							
							
							
						 
						
							2025-02-03 17:32:38 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								b2dd490926 
								
							 
						 
						
							
							
								
								add missing try catch around jinja parsing to default to chatml  
							
							
							
						 
						
							2025-02-03 17:32:12 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								4cb0e1d873 
								
							 
						 
						
							
							
								
								Merge branch 'jinja-chatml' into r1-toolcall  
							
							
							
						 
						
							2025-02-03 17:15:14 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								2b3c4829a3 
								
							 
						 
						
							
							
								
								fix build / rm diff  
							
							
							
						 
						
							2025-02-03 16:34:43 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Daniel Bevenius 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								5598f475be 
								
							 
						 
						
							
							
								
								server : remove CPPHTTPLIB_NO_EXCEPTIONS define (#11622)
							
							
							
							
							This commit removes the CPPHTTPLIB_NO_EXCEPTIONS define from the server
code.
The motivation for this is that when using a debug build the server
would crash when an exception was thrown and terminate the server
process, as it was unhandled. When CPPHTTPLIB_NO_EXCEPTIONS is set
cpp_httplib will not call the exception handler, which would normally
return a 500 error to the client. This caused tests to fail when using
a debug build.
Fixes: https://github.com/ggerganov/llama.cpp/issues/11613  
							
						 
						
							2025-02-03 16:45:38 +01:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								aa98e59038 
								
							 
						 
						
							
							
								
								fix bad merge  
							
							
							
						 
						
							2025-02-03 14:01:49 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								5d18d76b69 
								
							 
						 
						
							
							
								
								fix double bos issue (drop bos/eos tokens from jinja template)  
							
							
							
						 
						
							2025-02-03 13:59:16 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
							
							
								
							
							
								cf83623a47 
								
							 
						 
						
							
							
								
								fix typo  
							
							
							
						 
						
							2025-02-03 13:58:46 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Georgi Gerganov 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								8ec05832fa 
								
							 
						 
						
							
							
								
								sync : ggml  
							
							
							
						 
						
							2025-02-03 14:57:08 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Johannes Gäßler 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								21c84b5d2d 
								
							 
						 
						
							
							
								
								CUDA: fix Volta FlashAttention logic (#11615)
							
							
							
						 
						
							2025-02-03 14:25:56 +02:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								a76073cf88 
								
							 
						 
						
							
							
								
								minimize diffs  
							
							
							
						 
						
							2025-02-03 10:58:52 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								77ae97e7d6 
								
							 
						 
						
							
							
								
								Update test_tool_call.py  
							
							
							
						 
						
							2025-02-03 10:28:30 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									mashdragon 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								d92cb67e37 
								
							 
						 
						
							
							
								
								server : (webui) Fix Shift+Enter handling (#11609)
							
							
							
							
							* Fix Shift+Enter handling
`exact` on the Enter handler means the message is not sent when Shift+Enter is pressed anyway
* build index.html.gz
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co> 
							
						 
						
							2025-02-03 10:42:55 +01:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								1e9acd2d31 
								
							 
						 
						
							
							
								
								tool-call: allow --jinja --chat-template chatml  
							
							
							
						 
						
							2025-02-03 04:07:11 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								5e6f2a21ae 
								
							 
						 
						
							
							
								
								add deepseek models to server tool call section in readme  
							
							
							
						 
						
							2025-02-03 02:44:42 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								19bea4ecc3 
								
							 
						 
						
							
							
								
								tell DS R1 not to overthink (weather test)  
							
							
							
						 
						
							2025-02-03 02:24:30 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								ae9d5812a7 
								
							 
						 
						
							
							
								
								tool-calls: add DeepSeek R1 Qwen 7B to server test_hello_world  
							
							
							
						 
						
							2025-02-03 02:24:30 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								04be723b33 
								
							 
						 
						
							
							
								
								tool-call: fix command-r7b parsing when response is multiline  
							
							
							
						 
						
							2025-02-03 02:24:30 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								73d08d49cf 
								
							 
						 
						
							
							
								
								tool-call: allow --jinja --chat-template chatml  
							
							
							
						 
						
							2025-02-03 02:24:30 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								08716281f2 
								
							 
						 
						
							
							
								
								rename tests  
							
							
							
						 
						
							2025-02-03 02:24:30 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								c80cb30938 
								
							 
						 
						
							
							
								
								update logs  
							
							
							
						 
						
							2025-02-03 02:24:30 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								28345877e4 
								
							 
						 
						
							
							
								
								server/oai: ensure content is null when there are tool calls  
							
							
							
						 
						
							2025-02-03 02:24:30 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								04d511b5b5 
								
							 
						 
						
							
							
								
								Avoid double bos w/ jinja  
							
							
							
						 
						
							2025-02-03 02:24:30 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								130ca222c9 
								
							 
						 
						
							
							
								
								DeepSeek R1: parse thoughts / return in separate field in API (non streamed mode)  
							
							
							
						 
						
							2025-02-03 02:24:30 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								87de852b7f 
								
							 
						 
						
							
							
								
								pass vocab to common_chat_params_init  
							
							
							
						 
						
							2025-02-03 02:24:30 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									ochafik 
								
							 
						 
						
							
							
							
							
								
							
							
								d3b60b8ad8 
								
							 
						 
						
							
							
								
								minja: enhance backfill of templates w/o tools description (use example tool call delta!)  
							
							
							
						 
						
							2025-02-03 01:03:04 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Johannes Gäßler 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								6eecde3cc8 
								
							 
						 
						
							
							
								
								HIP: fix flash_attn_stream_k_fixup warning (#11604)
							
							
							
						 
						
							2025-02-02 23:48:29 +01:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									uvos 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								396856b400 
								
							 
						 
						
							
							
								
								CUDA/HIP: add support for selectable warp size to mmv (#11519)
							
							
							
							
							CUDA/HIP: add support for selectable warp size to mmv 
							
						 
						
							2025-02-02 22:40:09 +01:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									uvos 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								4d0598e144 
								
							 
						 
						
							
							
								
								HIP: add GGML_CUDA_CC_IS_* for AMD families, as increasing cc architectures for AMD GPUs are not supersets of each other (#11601)
							
							
							
							
							This fixes a bug where RDNA1 GPUs other than gfx1010 were not handled correctly
							
						 
						
							2025-02-02 22:08:05 +01:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								90f9b88afb 
								
							 
						 
						
							
							
								
								nit: more informative crash when grammar sampler fails (#11593)
							
							
							
						 
						
							2025-02-02 19:58:34 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Johannes Gäßler 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								864a0b67a6 
								
							 
						 
						
							
							
								
								CUDA: use mma PTX instructions for FlashAttention (#11583)
							
							
							
							
							* CUDA: use mma PTX instructions for FlashAttention
* __shfl_sync workaround for movmatrix
* add __shfl_sync to HIP
Co-authored-by: Diego Devesa <slarengh@gmail.com> 
							
						 
						
							2025-02-02 19:31:09 +01:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Eric Curtin 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								84ec8a58f7 
								
							 
						 
						
							
							
								
								Name colors (#11573)
							
							
							
							
							It's more descriptive, use #define's so we can use compile-time
concatenations.
Signed-off-by: Eric Curtin <ecurtin@redhat.com> 
							
						 
						
							2025-02-02 15:14:48 +00:00 
							
								 
							
						 
					 
				
					
						
							
								
								
									Olivier Chafik 
								
							 
						 
						
							
							
								
								
							
							
							
								
							
							
								bfcce4d693 
								
							 
						 
						
							
							
								
								tool-call: support Command R7B (+ return tool_plan "thoughts" in API) (#11585)
							
							
							
							
							* `tool-call`: support Command R7B (w/ tool_plan return)
* `tool-call`: cleaner preservation of tokens + warn when likely bad chat template override
* `tool-call`: test cleanup / handle lazy grammar triggers 
							
						 
						
							2025-02-02 09:25:38 +00:00