Commit graph

  • b75aa7388b update hfh version. Vaibhav Srivastav 2024-09-30 16:15:19 +02:00
  • 8428576373 update transfomers version. Vaibhav Srivastav 2024-09-30 16:04:04 +02:00
  • a75c5c4295 model is loadable Xuan Son Nguyen 2024-09-30 13:53:36 +02:00
  • 8277a817f1 console : utf-8 fix for windows stdin (#9690) b3849 Ruchira Hasaranga 2024-09-30 13:53:42 +05:30
  • b041ef4907 Update common/console.cpp Georgi Gerganov 2024-09-30 11:23:33 +03:00
  • d9451fd647 antiprompts: avoid c++20 struct initializers in test ochafik 2024-09-30 04:08:55 +01:00
  • 0fc5ad7ae1 minja: avoid c++20 struct initializers in test ochafik 2024-09-30 03:51:48 +01:00
  • 277f38536c minja: attempt to handle windows' crlf ochafik 2024-09-30 03:45:50 +01:00
  • a9400f1442 mtgpu: enable docker workflow Xiaodong Ye 2024-09-30 10:32:10 +08:00
  • a261c1ade0 mtgpu: add docker image support Xiaodong Ye 2024-09-30 09:01:12 +08:00
  • dd34db2636 Update GGML_ASSERT Andrei Betlen 2024-09-29 16:15:30 -04:00
  • d07387ca9c server: speed up cancel test setup ochafik 2024-09-29 21:04:27 +01:00
  • 0e9c4bf5af server: update log ochafik 2024-09-29 21:02:18 +01:00
  • 0875f7686e Merge branch 'master' of https://github.com/ggerganov/llama.cpp into add-paligemma-support Andrei Betlen 2024-09-29 15:11:44 -04:00
  • f1fb9141d2 Update branch Andrei Betlen 2024-09-29 15:10:27 -04:00
  • c5a0d57ee5 Update cancel.feature ochafik 2024-09-29 19:37:23 +01:00
  • c919d5db39 ggml : define missing HWCAP flags (#9684) b3848 Georgi Gerganov 2024-09-29 21:18:23 +03:00
  • d0b1d663e4 sync : ggml b3847 Georgi Gerganov 2024-09-29 21:16:07 +03:00
  • aaa4099925 CUDA: remove bad assert (ggml/972) Johannes Gäßler 2024-09-29 19:56:17 +02:00
  • 641002fba8 vulkan : multithread pipeline creation (ggml/963) Jeff Bolz 2024-09-29 11:50:17 -05:00
  • 0de8b203f1 vulkan : fix build for GGML_VULKAN_RUN_TESTS, add TFLOPS to log (ggml/961) Jeff Bolz 2024-09-27 02:58:01 -05:00
  • 544f409b4b vulkan : argsort barriers must be under uniform control flow (ggml/951) Salvatore Mesoraca 2024-09-26 08:59:42 +02:00
  • 6084bfb261 ggml : fix GGML_MAX_N_THREADS + improve formatting (ggml/969) Georgi Gerganov 2024-09-24 13:23:59 +03:00
  • 231a5e4914 server: fix seed in tests (comma creates a tuple) ochafik 2024-09-29 19:13:27 +01:00
  • 3f96ab04a6 server: fix cancel tests ochafik 2024-09-29 19:12:59 +01:00
  • 88c9b5497a server: simplify handle_tasks signature ochafik 2024-09-29 19:01:48 +01:00
  • 18f8352586 utf-8 fix for windows stdin Ruchira Hasaranga 2024-09-29 23:30:18 +05:30
  • 419e9952c9 server: rm superfluous is_alive check in streamed code ochafik 2024-09-29 17:11:53 +01:00
  • cd806a7e88 add llava to conversion Xuan Son Nguyen 2024-09-29 16:28:16 +02:00
  • faac0bae26 common : ensure llama_batch size does not exceed max size (#9668) b3841 matiaslin 2024-09-29 05:25:00 -07:00
  • f99d3f8367 py : add model class for Chameleon conversion (#9683) nopperl 2024-09-29 12:02:06 +00:00
  • 922bf99a7e ggml : define missing HWCAP flags Willy Tarreau 2024-09-29 14:57:14 +03:00
  • 589b48d41e contrib : add Resources section (#9675) Georgi Gerganov 2024-09-29 14:38:18 +03:00
  • 1ddd9ab795 use new model class for chameleon conversion nopperl 2024-09-29 10:59:30 +02:00
  • 515601982d fix type caitianchi 2024-09-29 15:35:09 +08:00
  • 61b3893165 flake.lock: Update github-actions[bot] 2024-09-29 00:40:10 +00:00
  • 5f00747a90 server: test request cancellation (WIP) ochafik 2024-09-29 01:10:18 +01:00
  • 4dcb3ea943 tests: allow artificial slowdown of sampling for tests ochafik 2024-09-29 01:09:41 +01:00
  • 1da67a395c server: support cancelling non-streamed requests ochafik 2024-09-29 01:08:16 +01:00
  • 9ac4b04aa2 tool-call: add fs_list_files to common, w/ win32 impl for msys2 build ochafik 2024-09-29 00:34:07 +01:00
  • cb7912ee74 chat-template: add phi-3.5-vision-instruct ochafik 2024-09-29 00:33:19 +01:00
  • 8738d94bbd minja: qualify std::nullptr_t type for msys2 build ochafik 2024-09-29 00:18:22 +01:00
  • c87c12168a tool-call: fix memory leak in test ochafik 2024-09-28 23:44:28 +01:00
  • 22493c8e9e tests: fix test-chat-template run from build ochafik 2024-09-28 23:31:23 +01:00
  • ad6719e2a7 tests: fix typo ochafik 2024-09-28 23:26:19 +01:00
  • a072f30a8d tests: attempt to find assets for tests run from build subfolder ochafik 2024-09-28 23:15:36 +01:00
  • bc3e0c0830 tool-call: Qwen 2.5 Instruct also requires object arguments ochafik 2024-09-28 23:05:35 +01:00
  • b10ef04d8d chat-template: tweak --chat-template error message when --jinja is set ochafik 2024-09-28 22:36:38 +01:00
  • dbda025f87 tool-call: test messages -> template -> grammar -> tool call parser ochafik 2024-09-28 22:32:47 +01:00
  • b790a7ff29 Allow simpler function calling sytax, like used with Phi-3 function calling model Don Mahurin 2024-09-28 14:10:55 -07:00
  • 8550b76f4e Move function format specification to function_tool.py Don Mahurin 2024-09-28 14:10:55 -07:00
  • 0ae1112faa agent: try to fix pyright lint ochafik 2024-09-28 20:10:08 +01:00
  • 1b32ac129f chat-template: fix test-arg ochafik 2024-09-28 20:06:10 +01:00
  • 9358d1f62c minja: fix gcc8 build of test ochafik 2024-09-28 19:50:08 +01:00
  • e6be59c2a0 antiprompts: fix gcc8 build (avoid recursive struct) ochafik 2024-09-28 19:39:52 +01:00
  • ef2a020276 tool-call: make agent async ochafik 2024-09-28 19:11:09 +01:00
  • 05bbba9f8a tool-call: only match json eagerly for Llama 3.2 ochafik 2024-09-28 19:05:10 +01:00
  • 6e0053a81b chat-template: enumerate files w/ C API rather than private using std::__fs::filesystem ochafik 2024-09-28 18:47:11 +01:00
  • c657857e21 tool-call: cleanup tools.py ochafik 2024-09-28 18:31:51 +01:00
  • 55cf337560 tool-call: better error reporting for server tests ochafik 2024-09-28 18:31:22 +01:00
  • 7cef90cf9c tool-call: more eager function call parsing for Functionary & Llama (give a chance to 3B model) ochafik 2024-09-28 18:30:59 +01:00
  • 8b2cf3509f tool-call: fix grammar trigger crash ochafik 2024-09-28 18:30:01 +01:00
  • d983516f40 tool-call: let the tool call handler expand chat template, moving builtin_tools down as extra_context ochafik 2024-09-28 17:46:36 +01:00
  • 0c85bc7a8f tool-call: test tool call style detection ochafik 2024-09-28 17:43:09 +01:00
  • c197f0fcbd common: ensure token addition to batch does not exceed llama_batch size Matias Lin 2024-09-27 10:03:33 -07:00
  • f4d2b8846a llama : add reranking support (#9510) Georgi Gerganov 2024-09-28 17:42:03 +03:00
  • 5d0251def0 Merge c9ae1916ec into 1b2f992cd2 Chen Xi 2024-09-28 09:10:12 -04:00
  • d3f44a41a8 contrib : add Resources section Georgi Gerganov 2024-09-28 15:36:06 +03:00
  • 1b2f992cd2 test-backend-ops : use flops for some performance tests (#9657) b3837 slaren 2024-09-28 14:32:46 +02:00
  • aeac876864 Merge branch 'master' into gg/rerank Georgi Gerganov 2024-09-28 15:15:29 +03:00
  • 739842703e llama : add comment about thread-safety [no ci] (#9449) Georgi Gerganov 2024-09-28 15:13:21 +03:00
  • 6102037bbb vocab : refactor tokenizer to reduce init overhead (#9449) b3835 Zhenwei Jin 2024-09-28 20:10:58 +08:00
  • 9a913110cf llama : add support for Chameleon (#8543) b3834 nopperl 2024-09-28 12:08:43 +00:00
  • 43bcdd9703 readme : add tool (#9655) Aarni Koskela 2024-09-28 15:07:14 +03:00
  • 6a0f779484 ggml : add run-time detection of neon, i8mm and sve (#9331) b3832 Dan Johansson 2024-09-28 14:06:16 +02:00
  • 39167b69c0 llama : fix comment [no ci] Georgi Gerganov 2024-09-28 14:51:57 +03:00
  • 89f9944981 Enable use to the rebar feature to upload buffers to the device. (#9251) b3831 Markus Tavenrath 2024-09-28 12:05:05 +02:00
  • 287f83d244 run each test for at least one second, simplify perf cases slaren 2024-09-28 03:00:34 +02:00
  • 887951beb0 minja: generate chat goldens w/ fixed date to support Llama-3.2-3B-Instruct (uses strftime_now) ochafik 2024-09-27 19:52:15 +01:00
  • 701b664551 minja: add indent filter to support command-r-plus's chat templates ochafik 2024-09-27 19:00:14 +01:00
  • b5de3b74a5 readme : update hot topics Georgi Gerganov 2024-09-27 20:57:51 +03:00
  • 0093a5e527 minja: fix identifiers parsing (when start w/ not/is/etc) and lstrip_blocks corner case (needed by DeepSeek-V2.5 ochafik 2024-09-27 18:30:44 +01:00
  • a4ac45f659 update server docs Xuan Son Nguyen 2024-09-27 15:30:41 +02:00
  • 0d6f6a799f add --reranking argument Xuan Son Nguyen 2024-09-27 15:25:39 +02:00
  • 84b0af8355 Update examples/server/server.cpp Georgi Gerganov 2024-09-27 10:46:37 +03:00
  • 44f59b4301 cmake : add option for common library (#9661) b3829 Borislav Stanimirov 2024-09-27 10:42:06 +03:00
  • 2f25ee30ef Update README.md ochafik 2024-09-27 07:18:07 +01:00
  • 86e4f99092 Update README.md ochafik 2024-09-27 07:15:25 +01:00
  • e62b5de3cf tool-call: fix functionary-small-3.2 (first tool starts w/ name\n, subsequent are >>>name\n) ochafik 2024-09-27 07:06:33 +01:00
  • e33b342da7 tool-call: fix passing of tools to template + allow agent to finish ochafik 2024-09-27 06:24:22 +01:00
  • f62e688387 tool-call: fix crash / test non-tool call case (added llama_sampler_is_grammar_empty) ochafik 2024-09-27 06:04:41 +01:00
  • 0abfa36ca7 tool-call: move usage examples to examples/agent ochafik 2024-09-27 05:10:30 +01:00
  • 6610ecf965 server: rm bad debug code ochafik 2024-09-27 04:07:35 +01:00
  • 27cd07a056 json: fix grammar conversion typo ochafik 2024-09-27 03:57:48 +01:00
  • 9295ca95db tool-call: fix agent type lints ochafik 2024-09-27 03:53:56 +01:00
  • 1e5c0e747e chat-template: fix jinja tests (make safe a passthrough) ochafik 2024-09-27 03:50:04 +01:00
  • f9c1743bb5 minja: fix iterables ochafik 2024-09-27 03:36:49 +01:00
  • 8299fac07c tool-call: adapt very simple agent + docker isolation from https://github.com/ggerganov/llama.cpp/pull/6389 ochafik 2024-09-26 21:07:46 +01:00
  • 10f9fe8d49 tool-call: fix tool call return format ochafik 2024-09-26 21:01:04 +01:00
  • c88c932d98 fix gcc error + lint ochafik 2024-09-26 19:18:40 +01:00