Commit graph

  • 2efa0c27bf tool-call: add weather tool e2e tests ochafik 2025-01-27 15:02:09 +00:00
  • 15ec01e896 jinja: only add special tokens if template doesn't seem to handle them ochafik 2025-01-27 14:28:11 +00:00
  • 226d59270b rm trailing spaces Xuan Son Nguyen 2025-01-27 15:26:01 +01:00
  • e5aeb423a5 fix bad merging Xuan Son Nguyen 2025-01-27 15:24:46 +01:00
  • da606d8d41 tool-call: remove nonsensical code_interpreter code ochafik 2025-01-27 14:19:20 +00:00
  • 610b3ac3cd ggml : x2 speed for WASM by optimizing SIMD Xuan Son Nguyen 2025-01-27 15:04:44 +01:00
  • d6d24cd9ed AMD: parse the architecture as supplied by gcnArchName (#11244) b4567 Haus1 2025-01-27 08:58:17 -05:00
  • a5203b4465 llama : minor fixes for up llama load model speed (#11448) b4566 lexasub 2025-01-27 17:42:09 +04:00
  • 723fc66511 Update src/llama-vocab.cpp Diego Devesa 2025-01-27 14:24:08 +01:00
  • e665b57fa2 Merge branch 'master' into gg/llama-kv-cache Georgi Gerganov 2025-01-27 14:00:56 +02:00
  • bddc1bebcc tool-call: fix special handling of special trigger tokens (Nemo) ochafik 2025-01-27 11:37:41 +00:00
  • 7c2b924232 llama_model_loader::init_mapping - replace new llama_mmap to std::make_unique<llama_mmap> for clean code & reduce (/2) time of running init_mappings lexasub 2025-01-27 15:10:54 +04:00
  • 5144a18e67 impl::load change map bpe_ranks to onordered map for reduce time of impl::load on 30% lexasub 2025-01-27 15:09:34 +04:00
  • df984e0147 llama: refactor llama_decode_impl (#11381) b4565 Johannes Gäßler 2025-01-27 12:07:12 +01:00
  • 533b4473dc Fix token shift for RWKV6Qwen2 Molly Sophia 2025-01-27 16:45:37 +08:00
  • 1eee98f01f llama : removed unnecessary code in DeepSeek V2 implementation Stanisław Szymczyk 2025-01-27 09:32:25 +01:00
  • 93c5937249 llama : modified tensor permutations to multiply larger matrices during inference Stanisław Szymczyk 2025-01-26 22:23:13 +01:00
  • acd38efee3 metal: Handle null returned from MTLCreateSystemDefaultDevice() (#11441) b4564 Ihar Hrachyshka 2025-01-27 02:41:59 -05:00
  • ca0c837b6a nits ochafik 2025-01-27 01:08:29 +00:00
  • 77860f97db metal: Handle null returned from MTLCreateSystemDefaultDevice() Ihar Hrachyshka 2025-01-26 18:52:31 -05:00
  • f7078cab36 tool-call: fix functionary v3.1 required test ochafik 2025-01-26 23:23:09 +00:00
  • b8005c8b72 fix makefile logic for architecture specific flags Jon Haus 2025-01-26 17:44:06 -05:00
  • bb37819954 Address PR feedback Nikita Sarychev 2025-01-26 14:44:04 -08:00
  • 9c27481ed0 Merge remote-tracking branch 'upstream/master' into Remove_obsolete_HIP_workaround Nikita Sarychev 2025-01-26 14:16:49 -08:00
  • 039c2dd134 docker: add perplexity and bench commands to full image rare-magma 2025-01-26 23:16:19 +01:00
  • caf773f249 docker : fix ARM build and Vulkan build (#11434) Xuan Son Nguyen 2025-01-26 22:45:32 +01:00
  • 0c90945dbd docker: allow installing pip packages system-wide rare-magma 2025-01-26 22:39:57 +01:00
  • 5ec4c5e4d3 reshuffle chat handlers ochafik 2025-01-26 21:38:07 +00:00
  • 43385b2ff2 sync: minja ochafik 2025-01-26 21:36:25 +00:00
  • 437ff3178c add final newline Michal Moskal 2025-01-26 13:04:36 -08:00
  • 00fcd984d5 include <cmath> for INFINITY Michal Moskal 2025-01-26 12:36:06 -08:00
  • 1afc53a338 fix warning Michal Moskal 2025-01-26 12:33:11 -08:00
  • 08fefd1d7c fix whitespace Michal Moskal 2025-01-26 12:30:02 -08:00
  • c5c43b9643 vulkan: Catch pipeline creation failure and print an error message Jeff Bolz 2025-01-26 14:07:36 -06:00
  • ae08a95fb8 vulkan: try jammy Xuan Son Nguyen 2025-01-26 19:47:10 +01:00
  • d5ba035b8b no fast fail Xuan Son Nguyen 2025-01-26 19:35:27 +01:00
  • 987a8312c1 fix pip Xuan Son Nguyen 2025-01-26 19:34:36 +01:00
  • bc5900395c AMD: parse the architecture as supplied by gcnArchName Jon Haus 2025-01-26 13:24:24 -05:00
  • a0c500b4dc context : prepare for abstraction Georgi Gerganov 2025-01-17 21:11:03 +02:00
  • 99422dfa3f context : introduce llama_batch_manager Georgi Gerganov 2025-01-17 20:30:16 +02:00
  • cb8f2095c6 wip Georgi Gerganov 2025-01-17 19:37:52 +02:00
  • 133ad6a723 context : initial need_reserve logic Georgi Gerganov 2025-01-17 14:42:09 +02:00
  • c75ba6851e context : move adapter code in the implementation [no ci] Georgi Gerganov 2025-01-17 12:41:16 +02:00
  • f0713498fd context : add get_ctx_padding() Georgi Gerganov 2025-01-17 11:51:35 +02:00
  • b4ec1d4429 cont : move kv_self update to llama_context Georgi Gerganov 2025-01-16 21:55:12 +02:00
  • f2524c0e41 llama : remove references to llama_kv_cache (wip) Georgi Gerganov 2025-01-16 15:04:14 +02:00
  • ae274f9747 llama : fix names [no ci] Georgi Gerganov 2025-01-15 13:35:56 +02:00
  • a19f671fe0 context : minor Georgi Gerganov 2025-01-15 10:54:21 +02:00
  • 17b363afd3 llama : update llama_kv_self API Georgi Gerganov 2025-01-14 16:47:34 +02:00
  • efc36c9acf add $LLGUIDANCE_LOG_LEVEL support Michal Moskal 2025-01-26 10:15:22 -08:00
  • fd05ab87aa kv_cache : move state read/write to llama_kv_cache Georgi Gerganov 2025-01-14 13:13:35 +02:00
  • 4cd1b6fa4c context : prepare kv_cache_read/write to be moved to kv_cache Georgi Gerganov 2025-01-14 12:33:13 +02:00
  • 73a14eccc9 kv_cache : minor Georgi Gerganov 2025-01-14 11:56:53 +02:00
  • fef90cb3d7 kv_cache : fix Georgi Gerganov 2025-01-13 15:58:20 +02:00
  • 4d7bd03e65 kv_cache : functions -> members Georgi Gerganov 2025-01-13 15:50:39 +02:00
  • e4550fbafc llama : cont Georgi Gerganov 2025-01-13 14:56:52 +02:00
  • f78b396ee7 llama : add struct llama_kv_cache (wip) [no ci] Georgi Gerganov 2025-01-13 14:13:11 +02:00
  • c9e9853e6c format file Michal Moskal 2025-01-26 10:11:39 -08:00
  • 44e1973af0 update llg Michal Moskal 2025-01-26 10:09:57 -08:00
  • ca88ce7b77 llama_tokenizer() in fact requires valid utf8 Michal Moskal 2025-01-26 10:09:51 -08:00
  • 178a7eb952 metal : use residency sets (#11427) b4562 Georgi Gerganov 2025-01-26 20:06:16 +02:00
  • 853cbbed72 build arm64/amd64 separatedly Xuan Son Nguyen 2025-01-26 19:02:55 +01:00
  • 8e027f8dcd align tests with LLG grammar syntax and JSON Schema spec Michal Moskal 2025-01-26 09:59:31 -08:00
  • a2d5852ceb ci : do not fail-fast for docker Xuan Son Nguyen 2025-01-26 18:40:34 +01:00
  • 225d2e0ca1 metal : fix build + clean-up Georgi Gerganov 2025-01-26 19:35:02 +02:00
  • 9dc5ef45d8 metal : check env GGML_METAL_NO_RESIDENCY Georgi Gerganov 2025-01-26 19:31:08 +02:00
  • 202f323e66 llama : add a second copy of c^KV cache in DeepSeek2 MLA to avoid transposing the cache during inference Stanisław Szymczyk 2025-01-26 18:29:54 +01:00
  • 6f53d8a6b4 docker: add missing vulkan library to base layer and update to 24.04 (#11422) Nuno 2025-01-26 18:22:43 +01:00
  • 0a211fcb9d add gh action for llg test Michal Moskal 2025-01-26 09:06:38 -08:00
  • c7ebf57822 rename llguidance test file to test-grammar-llguidance.cpp Michal Moskal 2025-01-26 08:54:56 -08:00
  • 29375376fe conditionally include llguidance test based on LLAMA_LLGUIDANCE flag Michal Moskal 2025-01-26 08:53:49 -08:00
  • 16a5484048 gbnf -> lark syntax Michal Moskal 2025-01-26 08:50:59 -08:00
  • f245ca26f5 build and run test Michal Moskal 2025-01-26 08:49:05 -08:00
  • 036b91fbc3 fix ref-count bug Michal Moskal 2025-01-26 08:48:53 -08:00
  • 114eda130f Merge 924c832461 into 19f65187cb fedric95 2025-01-26 16:42:58 +00:00
  • 87b78eea62 Merge 928aa66a92 into 19f65187cb Brian 2025-01-26 16:39:46 +00:00
  • ea41ab4736 Merge 6ee4d4c1f2 into 19f65187cb Jeff Price 2025-01-26 16:38:08 +00:00
  • aeeb88094c Merge 17db6beda4 into 19f65187cb Robert 2025-01-26 16:35:29 +00:00
  • bb3d86bea2 Merge 6d126d0acc into 19f65187cb staviq 2025-01-26 16:34:45 +00:00
  • 56b7efa3e1 Merge b51ae5eecb into 19f65187cb chrismrutherford 2025-01-26 16:32:16 +00:00
  • 6781038629 Merge 4e23f8a81b into 19f65187cb Chad Brewbaker 2025-01-26 16:31:03 +00:00
  • 8da041c7b7 Merge ee1c6a4d89 into 19f65187cb Amit Kumar Jha 2025-01-26 16:30:29 +00:00
  • a1da890331 Merge 7e492b3e0e into 19f65187cb Marko Tasic 2025-01-26 16:29:53 +00:00
  • 32bce94b6b Merge 3277bb88e5 into 19f65187cb hackingthekernel 2025-01-26 16:26:46 +00:00
  • 4b7260f26b Merge ddd9971236 into 19f65187cb Yazan Agha-Schrader 2025-01-26 16:26:40 +00:00
  • 58006ddb13 clang fmt Michal Moskal 2025-01-26 08:20:26 -08:00
  • 3675050804 copy test-grammar-integration.cpp to test-llguidance.cpp Michal Moskal 2025-01-26 08:18:10 -08:00
  • a7be6669b1 pass vocab not model to llama_sampler_init_llg() Michal Moskal 2025-01-26 08:16:56 -08:00
  • 19f65187cb cmake: add ggml find package (#11369) b4560 bandoti 2025-01-26 12:07:48 -04:00
  • de269a1833 fix tests when llg is enabled Michal Moskal 2025-01-26 08:02:37 -08:00
  • 11594557e3 Merge branch 'tool-call' into tool-call-handler ochafik 2025-01-26 15:32:53 +00:00
  • 3f3fc03983 nit: trailing spaces ochafik 2025-01-26 15:32:13 +00:00
  • 1d8ee06000 rpc: fix register position (#11424) b4559 Frank Mai 2025-01-26 23:20:34 +08:00
  • b9126fe364 metal : release descriptors Georgi Gerganov 2025-01-26 16:38:54 +02:00
  • 7fb39e39e9 metal : restore commandBufferWithUnretainedReferences calls [no ci] Georgi Gerganov 2025-01-26 16:32:41 +02:00
  • 2674f02e4f metal : use residency sets Georgi Gerganov 2025-01-26 12:33:16 +02:00
  • 118b4f08a8 Fix array length mismatches Rémy O 2025-01-25 12:20:35 +01:00
  • 10cc151af5 port failing dequant callbacks from mul_mm Jeff Bolz 2025-01-23 13:58:21 -06:00
  • 097befd022 vulkan: vertically realign code Rémy O 2025-01-22 20:52:03 +01:00
  • 520e6b1626 vulkan: initial support for IQ2_S Rémy O 2025-01-19 16:16:24 +01:00