Commit graph

  • 08296ec74d fix recv_with_timeout Xuan Son Nguyen 2025-01-22 10:44:23 +01:00
  • aa91158110 Adding logprobs to /v1/completions Jiri Podivin 2025-01-21 17:01:03 +01:00
  • 3e3357fd77 llava : support Minicpm-omni (#11289) b4525 tc-mb 2025-01-22 15:35:48 +08:00
  • c31a3406db Vulkan-run-test: fix mmq_wg_denoms Dong Wang 2025-01-22 14:07:08 +08:00
  • 2dd09c792f more cleanups Olivier Chafik 2025-01-22 03:20:47 +00:00
  • 28cac497a6 drop llama_sampler_accept_str Olivier Chafik 2025-01-22 02:38:04 +00:00
  • e211629b89 Merge branch 'string_utils' into tool-call Olivier Chafik 2025-01-22 02:27:10 +00:00
  • 6fa26b6a20 Exclude non PR event jiahao su 2025-01-22 10:25:36 +08:00
  • 5140d7a00b Update common.cpp Olivier Chafik 2025-01-22 02:25:09 +00:00
  • 41a613bbd3 Merge branch 'string_utils' into tool-call Olivier Chafik 2025-01-22 02:22:20 +00:00
  • 03fe80f1bb drop unused fs_list_files Olivier Chafik 2025-01-22 02:22:03 +00:00
  • 4de5cf8a10 json: refactor to surface a versatile builder Olivier Chafik 2025-01-22 02:19:23 +00:00
  • 9a5acbb4a3 Factor string_join, string_split, string_repeat into common Olivier Chafik 2025-01-22 02:17:34 +00:00
  • 9e8b43f993 follow enum naming style for tool call styles Olivier Chafik 2025-01-22 02:13:02 +00:00
  • 5268ec8947 Refactor string helpers into common Olivier Chafik 2025-01-22 02:08:18 +00:00
  • d77fecc3dc shrink diff in json conversion code Olivier Chafik 2025-01-22 01:54:17 +00:00
  • 3972945798 common_tool_call rename Olivier Chafik 2025-01-22 01:54:08 +00:00
  • ef61a4c79e minimize diffs Olivier Chafik 2025-01-22 01:46:51 +00:00
  • dbf841b0d2 Push laziness down to grammar impl Olivier Chafik 2025-01-22 01:25:54 +00:00
  • f4dab52a24 server : add more clean up when cancel_tasks is called Xuan Son Nguyen 2025-01-22 00:04:09 +01:00
  • 44ec40a43a ggml: added/removed const references for simple types and structures less 16 bytes Herman Semenov 2025-01-21 18:58:31 +03:00
  • ad38e87329 rename everywhere Xuan Son Nguyen 2025-01-21 15:53:39 +01:00
  • 77f4098c83 Delete update_jinja_goldens.py Olivier Chafik 2025-01-21 14:41:59 +00:00
  • f6e73dac43 Remove examples/agent (moved to https://gist.github.com/ochafik/9246d289b7d38d49e1ee2755698d6c79) Olivier Chafik 2025-01-21 14:41:56 +00:00
  • b49d0521e9 rm tests/test-minja from makefile Olivier Chafik 2025-01-21 14:12:38 +00:00
  • 3d63db2d7b Merge remote-tracking branch 'origin/master' into cuda-releases Olivier Chafik 2025-01-21 13:48:18 +00:00
  • fec0260366 Merge remote-tracking branch 'origin/master' into tool-call Olivier Chafik 2025-01-21 13:44:58 +00:00
  • bd0714b977 reuse LLM_ARCH and LLM_TENSOR Xuan Son Nguyen 2025-01-21 14:27:16 +01:00
  • 6171c9d258 Add Jinja template support (#11016) b4524 Olivier Chafik 2025-01-21 13:18:51 +00:00
  • e28245f35f export-lora : fix tok_embd tensor (#11330) b4523 Xuan Son Nguyen 2025-01-21 14:07:12 +01:00
  • 6da5bec81c rpc : better caching of the base buffer pointer (#11331) b4522 Radoslav Gerganov 2025-01-21 15:06:41 +02:00
  • cbb9b819da rm unused optional header Olivier Chafik 2025-01-21 12:29:51 +00:00
  • 9151259799 rpc : better caching of the base buffer pointer Radoslav Gerganov 2025-01-21 13:44:27 +02:00
  • 510b626c03 export-lora : fix tok_embd tensor xsn/fix_lora_merge_tok_embd Xuan Son Nguyen 2025-01-21 12:29:13 +01:00
  • 49e0d99bb1 Add 'Ascend NPU' label restrictions jiahao su 2025-01-21 19:12:13 +08:00
  • 431bb08059 change gguf KV from clip to vit Xuan Son Nguyen 2025-01-21 10:51:26 +01:00
  • 2e2f8f093c linenoise.cpp refactoring (#11301) b4521 Eric Curtin 2025-01-21 09:32:35 +00:00
  • 79eac2727a cpu_pnp_strategy changes savesanketsw 2025-01-21 01:10:24 -08:00
  • 27f5e8a10a use clip_image_u8_free caitianchi 2025-01-21 17:08:45 +08:00
  • 2139667ec4 metal : fix out-of-bounds write (#11314) b4520 Georgi Gerganov 2025-01-21 08:48:13 +02:00
  • 3d47c267af vulkan: fix diag_mask_inf Jeff Bolz 2025-01-20 22:17:18 -06:00
  • c606255948 Merge branch 'jinja' into tool-call ochafik 2025-01-21 03:49:30 +00:00
  • f60e148bff shuffle actions back to original order ochafik 2025-01-21 03:28:56 +00:00
  • 9d8ebd62c6 Update minja from https://github.com/google/minja/pull/27 ochafik 2025-01-21 03:18:06 +00:00
  • 7a5b18e195 ditch ccache action + require cuda in release ochafik 2025-01-21 02:31:05 +00:00
  • b7b264cac0 ci: setup ccache ochafik 2025-01-21 01:59:47 +00:00
  • c92ae47637 ci: attempt to fix safe directory issue ochafik 2025-01-21 01:59:41 +00:00
  • ba8dd66fdf Merge branch 'jinja' into tool-call ochafik 2025-01-21 01:43:14 +00:00
  • ff2cce57ad Update minja to https://github.com/google/minja/pull/25 ochafik 2025-01-21 01:26:19 +00:00
  • 56aa93c266 fix std imports for gcc build ochafik 2025-01-21 00:08:22 +00:00
  • 7ea6a06cde Merge branch 'jinja' into tool-call ochafik 2025-01-20 23:59:24 +00:00
  • 8347da907d Update minja to b8437df626 ochafik 2025-01-20 23:59:15 +00:00
  • b110374714 apply renames from jinja branch ochafik 2025-01-20 23:59:01 +00:00
  • 9bab6939cd Merge branch 'jinja' into tool-call ochafik 2025-01-20 23:55:12 +00:00
  • 8a7c89e60c reinstate assert on chat_templates.template_default ochafik 2025-01-20 23:44:42 +00:00
  • ee475d2f51 rename: common_chat_template[s] ochafik 2025-01-20 23:42:07 +00:00
  • 8348c605ac Warn against missing eos / bos tokens when jinja template references them ochafik 2025-01-20 23:00:47 +00:00
  • 54a669e09e Guard against missing eos/bos tokens (null token otherwise throws in llama_vocab::impl::token_get_attr) ochafik 2025-01-20 22:50:08 +00:00
  • 099f983949 Merge remote-tracking branch 'origin/master' into jinja ochafik 2025-01-20 21:58:04 +00:00
  • 154bfaaa39 Refactor chat template validation ochafik 2025-01-20 21:54:34 +00:00
  • 8c84aefd4d Update --chat-template-file w/ recent change to --chat-template ochafik 2025-01-20 21:48:31 +00:00
  • c9e8fdd70e Move chat_templates inside server_context + remove mutex ochafik 2025-01-20 21:25:18 +00:00
  • db9dd0c1ac Finish suggested renamings ochafik 2025-01-20 21:06:18 +00:00
  • 153e852411 Apply suggestions from code review Olivier Chafik 2025-01-20 20:55:52 +00:00
  • b8161015ff Update CMakeLists.txt Jordan Nanos 2025-01-20 12:44:39 -08:00
  • 80d0d6b4b7 common : add -hfd option for the draft model (#11318) b4519 Georgi Gerganov 2025-01-20 22:29:43 +02:00
  • 2ec8898196 cont : more fixes Georgi Gerganov 2025-01-20 22:16:24 +02:00
  • b14bb87d92 cont : fix env var Georgi Gerganov 2025-01-20 21:53:48 +02:00
  • 6ef22f0547 common : add -hfd option for the draft model Georgi Gerganov 2025-01-20 21:46:58 +02:00
  • 4165293c38 Attempt to fix weird git error by installing deps before clone Olivier Chafik 2025-01-20 18:03:30 +00:00
  • aea8ddd516 vulkan: fix coopmat2 validation failures (#11284) b4518 Jeff Bolz 2025-01-20 10:38:32 -06:00
  • e1349f4156 vulkan: sort shaders for more deterministic binary Jeff Bolz 2025-01-20 10:26:00 -06:00
  • 22ed6028af Temporarily upload artefacts in normal CI run to test artefacts Olivier Chafik 2025-01-20 15:59:29 +00:00
  • c9e7cbb08b safer jinja llama_chat_templates struct xsn/tmp_jinja_safer Xuan Son Nguyen 2025-01-20 16:58:29 +01:00
  • e014b6124e vulkan: fix coopmat2 validation failures Jeff Bolz 2025-01-17 14:15:51 -06:00
  • ac045e378e Update build.yml Olivier Chafik 2025-01-20 15:46:43 +00:00
  • 01ab27a8fd metal : fix out-of-bounds write Georgi Gerganov 2025-01-20 17:45:04 +02:00
  • b71c43c294 Update build.yml Olivier Chafik 2025-01-20 14:52:40 +00:00
  • 9f7add1cde examples : fix add_special conditions (#11311) Georgi Gerganov 2025-01-20 16:36:08 +02:00
  • 90d987b105 mmap: add include for cerrno (#11296) b4516 Christopher Nielsen 2025-01-20 09:02:43 -05:00
  • a4251edd6f cmake: fix shell command quoting in build-info script (#11309) Michael Podvitskiy 2025-01-20 15:02:15 +01:00
  • d8c0361ea3 examples : fix add_special conditions Georgi Gerganov 2025-01-20 15:57:59 +02:00
  • ec7f3ac9ab llama : add support for Deepseek-R1-Qwen distill model (#11310) b4514 Xuan Son Nguyen 2025-01-20 14:35:07 +01:00
  • 19be0a8d68 coding style Xuan Son Nguyen 2025-01-20 14:15:59 +01:00
  • 7542b6d6d8 llama : add support for Deepseek-R1-Qwen distill model Xuan Son Nguyen 2025-01-20 14:13:25 +01:00
  • 5bd9d35eb0 cmake: refined conditions for math library linking on windows Michael Podvitskiy 2025-01-20 13:52:54 +01:00
  • abd27fc7a2 Update build.yml Olivier Chafik 2025-01-20 12:41:04 +00:00
  • e762c0eb1d Update build.yml jiahao su 2025-01-20 20:33:48 +08:00
  • 3b2c8acfab Modify format error jiahao su 2025-01-20 19:40:15 +08:00
  • 7be066764f Update build.yml jiahao su 2025-01-20 19:30:56 +08:00
  • 2984d3cade Align artefact names on existing ones ochafik 2025-01-20 11:11:23 +00:00
  • 5eb87e9aa3 cuda builds: add libcurl ochafik 2025-01-20 10:51:26 +00:00
  • c85ae08922 Update build.yml jiahao su 2025-01-20 18:37:14 +08:00
  • 2d75e56a4a Merge branch 'master' into mascguy-errno Xuan Son Nguyen 2025-01-20 11:34:44 +01:00
  • 67075cc8bd Merge branch 'ggerganov:master' into cuda-releases Olivier Chafik 2025-01-20 10:24:54 +00:00
  • 0fb7baff6a cmake: fix shell command quoting in build-info script Michael Podvitskiy 2025-01-20 10:53:04 +01:00
  • 25c912ac11 Change to run on x86 system jiahao su 2025-01-20 17:14:53 +08:00
  • 448860d5e3 Merge a899673346 into ef6dada60c Michael Engel 2025-01-20 00:26:34 -08:00
  • ef6dada60c cont : fix whitespaces (#11305) b4513 Georgi Gerganov 2025-01-20 09:29:32 +02:00
  • ae3c1db2f9 llama : re-add LLM_ARCH_PHIMOE (#11305) b4512 Kyle Bruene 2025-01-20 01:21:01 -06:00