Commit graph

4844 commits

Author SHA1 Message Date
Olivier Chafik
01b345be0f Merge remote-tracking branch 'origin/master' into tool-call 2025-01-22 10:02:23 +00:00
Olivier Chafik
a94f3b2727
common: utils to split / join / repeat strings (from json converter) (#11342)
* Factor string_join, string_split, string_repeat into common

* json: refactor to surface a versatile builder

* Update common.cpp
2025-01-22 09:51:44 +00:00
tc-mb
3e3357fd77
llava : support Minicpm-omni (#11289)
* init

* add readme

* update readme

* no use make

* update readme

* update fix code

* fix editorconfig-checker

* no change convert py

* use clip_image_u8_free
2025-01-22 09:35:48 +02:00
Olivier Chafik
2dd09c792f more cleanups 2025-01-22 03:20:47 +00:00
Olivier Chafik
28cac497a6 drop llama_sampler_accept_str 2025-01-22 02:38:04 +00:00
Olivier Chafik
e211629b89 Merge branch 'string_utils' into tool-call 2025-01-22 02:27:10 +00:00
Olivier Chafik
5140d7a00b Update common.cpp 2025-01-22 02:25:09 +00:00
Olivier Chafik
41a613bbd3 Merge branch 'string_utils' into tool-call 2025-01-22 02:22:20 +00:00
Olivier Chafik
03fe80f1bb drop unused fs_list_files 2025-01-22 02:22:03 +00:00
Olivier Chafik
4de5cf8a10 json: refactor to surface a versatile builder 2025-01-22 02:19:23 +00:00
Olivier Chafik
9a5acbb4a3 Factor string_join, string_split, string_repeat into common 2025-01-22 02:17:34 +00:00
Olivier Chafik
9e8b43f993 follow enum naming style for tool call styles 2025-01-22 02:13:02 +00:00
Olivier Chafik
5268ec8947 Refactor string helpers into common 2025-01-22 02:08:18 +00:00
Olivier Chafik
d77fecc3dc shrink diff in json conversion code 2025-01-22 01:54:17 +00:00
Olivier Chafik
3972945798 common_tool_call rename 2025-01-22 01:54:08 +00:00
Olivier Chafik
ef61a4c79e minimize diffs 2025-01-22 01:46:51 +00:00
Olivier Chafik
dbf841b0d2 Push laziness down to grammar impl 2025-01-22 01:25:54 +00:00
Olivier Chafik
77f4098c83 Delete update_jinja_goldens.py 2025-01-21 14:41:59 +00:00
Olivier Chafik
f6e73dac43 Remove examples/agent (moved to https://gist.github.com/ochafik/9246d289b7d38d49e1ee2755698d6c79) 2025-01-21 14:41:56 +00:00
Olivier Chafik
b49d0521e9 rm tests/test-minja from makefile 2025-01-21 14:12:38 +00:00
Olivier Chafik
fec0260366 Merge remote-tracking branch 'origin/master' into tool-call 2025-01-21 13:44:58 +00:00
Olivier Chafik
6171c9d258
Add Jinja template support (#11016)
* Copy minja from 58f0ca6dd7

* Add --jinja and --chat-template-file flags

* Add missing <optional> include

* Avoid print in get_hf_chat_template.py

* No designated initializers yet

* Try and work around msvc++ non-macro max resolution quirk

* Update test_chat_completion.py

* Wire LLM_KV_TOKENIZER_CHAT_TEMPLATE_N in llama_model_chat_template

* Refactor test-chat-template

* Test templates w/ minja

* Fix deprecation

* Add --jinja to llama-run

* Update common_chat_format_example to use minja template wrapper

* Test chat_template in e2e test

* Update utils.py

* Update test_chat_completion.py

* Update run.cpp

* Update arg.cpp

* Refactor common_chat_* functions to accept minja template + use_jinja option

* Attempt to fix linkage of LLAMA_CHATML_TEMPLATE

* Revert LLAMA_CHATML_TEMPLATE refactor

* Normalize newlines in test-chat-templates for windows tests

* Forward decl minja::chat_template to avoid eager json dep

* Flush stdout in chat template before potential crash

* Fix copy elision warning

* Rm unused optional include

* Add missing optional include to server.cpp

* Disable jinja test that has a cryptic windows failure

* minja: fix vigogne (https://github.com/google/minja/pull/22)

* Apply suggestions from code review

* Finish suggested renamings

* Move chat_templates inside server_context + remove mutex

* Update --chat-template-file w/ recent change to --chat-template

* Refactor chat template validation

* Guard against missing eos/bos tokens (null token otherwise throws in llama_vocab::impl::token_get_attr)

* Warn against missing eos / bos tokens when jinja template references them

* rename: common_chat_template[s]

* reinstate assert on chat_templates.template_default

* Update minja to b8437df626

* Update minja to https://github.com/google/minja/pull/25

* Update minja from https://github.com/google/minja/pull/27

* rm unused optional header

---------

Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-01-21 13:18:51 +00:00
Xuan Son Nguyen
e28245f35f
export-lora : fix tok_embd tensor (#11330) 2025-01-21 14:07:12 +01:00
Radoslav Gerganov
6da5bec81c
rpc : better caching of the base buffer pointer (#11331)
There is no need to use a map; just store the base pointer in the buffer
context.
2025-01-21 15:06:41 +02:00
Eric Curtin
2e2f8f093c
linenoise.cpp refactoring (#11301)
More RAII mainly

Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-21 09:32:35 +00:00
Georgi Gerganov
2139667ec4
metal : fix out-of-bounds write (#11314)
ggml-ci
2025-01-21 08:48:13 +02:00
ochafik
c606255948 Merge branch 'jinja' into tool-call 2025-01-21 03:49:30 +00:00
ochafik
9d8ebd62c6 Update minja from https://github.com/google/minja/pull/27 2025-01-21 03:18:06 +00:00
ochafik
ba8dd66fdf Merge branch 'jinja' into tool-call 2025-01-21 01:43:14 +00:00
ochafik
ff2cce57ad Update minja to https://github.com/google/minja/pull/25 2025-01-21 01:26:19 +00:00
ochafik
56aa93c266 fix std imports for gcc build 2025-01-21 00:08:22 +00:00
ochafik
7ea6a06cde Merge branch 'jinja' into tool-call 2025-01-20 23:59:24 +00:00
ochafik
8347da907d Update minja to b8437df626 2025-01-20 23:59:15 +00:00
ochafik
b110374714 apply renames from jinja branch 2025-01-20 23:59:01 +00:00
ochafik
9bab6939cd Merge branch 'jinja' into tool-call 2025-01-20 23:55:12 +00:00
ochafik
8a7c89e60c reinstate assert on chat_templates.template_default 2025-01-20 23:44:42 +00:00
ochafik
ee475d2f51 rename: common_chat_template[s] 2025-01-20 23:42:07 +00:00
ochafik
8348c605ac Warn against missing eos / bos tokens when jinja template references them 2025-01-20 23:00:47 +00:00
ochafik
54a669e09e Guard against missing eos/bos tokens (null token otherwise throws in llama_vocab::impl::token_get_attr) 2025-01-20 22:50:08 +00:00
ochafik
099f983949 Merge remote-tracking branch 'origin/master' into jinja 2025-01-20 21:58:04 +00:00
ochafik
154bfaaa39 Refactor chat template validation 2025-01-20 21:54:34 +00:00
ochafik
8c84aefd4d Update --chat-template-file w/ recent change to --chat-template 2025-01-20 21:48:31 +00:00
ochafik
c9e8fdd70e Move chat_templates inside server_context + remove mutex 2025-01-20 21:25:18 +00:00
ochafik
db9dd0c1ac Finish suggested renamings 2025-01-20 21:06:18 +00:00
Olivier Chafik
153e852411
Apply suggestions from code review
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-01-20 20:55:52 +00:00
Georgi Gerganov
80d0d6b4b7
common : add -hfd option for the draft model (#11318)
* common : add -hfd option for the draft model

* cont : fix env var

* cont : more fixes
2025-01-20 22:29:43 +02:00
Jeff Bolz
aea8ddd516
vulkan: fix coopmat2 validation failures (#11284)
mul mat and flash attention shaders were loading f32 types directly into
A/B matrices, which happens to work but is technically invalid usage.
For FA, we can load it as an Accumulator matrix and convert; this
is not in the inner loop and is cheap enough. For mul mat, it's more
efficient to do this conversion in a separate pass and have the input(s)
be f16.

coopmat2 requires SPIR-V 1.6 (related to using LocalSizeId). LocalSizeId
requires maintenance4 be enabled, and SPIR-V 1.6 requires Vulkan 1.3.
2025-01-20 10:38:32 -06:00
Georgi Gerganov
9f7add1cde
examples : fix add_special conditions (#11311) 2025-01-20 16:36:08 +02:00
Christopher Nielsen
90d987b105
mmap: add include for cerrno (#11296)
ggml-ci

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-01-20 16:02:43 +02:00
Michael Podvitskiy
a4251edd6f
cmake: fix shell command quoting in build-info script (#11309) 2025-01-20 16:02:15 +02:00