Commit graph

  • c1c64db189 Merge 1f9316651f into c2a67efe38 Michał Moskal 2025-02-09 22:56:38 -08:00
  • 05b028b9c7 Merge a726adaef7 into c2a67efe38 bandoti 2025-02-09 22:24:28 -08:00
  • c2a67efe38 vulkan: Make Vulkan optional at runtime (#11493). (#11494) b4679 Danny Milosavljevic 2025-02-10 07:17:21 +01:00
  • b044a0fe3c vulkan: add environment variable GGML_VK_PREFER_HOST_MEMORY to avoid VRAM allocation (#11592) b4678 Wagner Bruna 2025-02-10 03:08:22 -03:00
  • 8521edb453 Merge fa522bc346 into 19d3c8293b edwko 2025-02-10 19:00:12 +13:00
  • 1a9c2635b3 rwkv7: do not quantize small yet 2D lora weights Molly Sophia 2025-02-08 12:41:43 +08:00
  • 41a80dfb03 RWKV_WKV6 testing: avoid some weird fails Molly Sophia 2025-02-01 19:22:19 +08:00
  • b5be8ff97f remove duplicate break; Molly Sophia 2025-02-01 18:36:02 +08:00
  • 39eb446ad6 rwkv: better handling for models without gate Molly Sophia 2025-02-01 10:27:40 +08:00
  • 9cad1ca194 rwkv: skip computing output for unused tokens for hybrid models Molly Sophia 2025-01-31 15:48:36 +08:00
  • cffd099aad rwkv7: Add some model type variants Molly Sophia 2025-01-29 14:21:55 +08:00
  • 1fdc00b255 Add _set_vocab_rwkv_world as a common function Molly Sophia 2025-01-29 13:58:42 +08:00
  • 922ebbe93d rwkv7: converter script simplification Molly Sophia 2025-01-29 13:42:49 +08:00
  • 2175aebdb1 Apply code-format changes Molly Sophia 2025-01-27 21:35:45 +08:00
  • f6be4dc661 Add support for ARWKV7 Hybrid models Molly Sophia 2025-01-27 17:45:43 +08:00
  • e9ba411d3e WKV7 Vulkan bugfix Molly Sophia 2025-01-25 12:51:02 +08:00
  • 2187607471 WKV7 Metal Molly Sophia 2025-01-25 12:40:39 +08:00
  • 3a2a97af28 ggml: metal unary exp & neg Molly Sophia 2025-01-23 12:53:18 +08:00
  • d564c4b534 Fix metal wkv6 inference Molly Sophia 2025-01-23 11:55:42 +08:00
  • 65307d279f update tests for 1b6 3b 7b zhiyuan li 2024-12-27 13:47:41 +08:00
  • 84b4f81ef1 initial support for apple zhiyuan li 2024-12-27 13:38:44 +08:00
  • e7794cb274 WKV7 Vulkan & sycl Molly Sophia 2025-01-16 22:49:12 +08:00
  • 9cd24dd3eb wkv7 CUDA impl Molly Sophia 2025-01-16 15:50:56 +08:00
  • 6dcc21e7f5 WIP: Add support for rwkv v7 Molly Sophia 2025-01-15 20:43:23 +08:00
  • 5445300758 ggml: Add op l2_norm Molly Sophia 2025-01-15 20:42:40 +08:00
  • cd6a76ad24 Merge d1bb943c10 into 19d3c8293b Aleksei Nikiforov 2025-02-10 11:50:10 +08:00
  • 7bf68a1ca6 chore: update ggml-cpu-aarch64.cpp Ikko Eltociear Ashimine 2025-02-10 12:34:08 +09:00
  • 13763a2a9b Merge branch 'ggerganov:master' into master Giovanni Petrantoni 2025-02-10 11:16:22 +08:00
  • bd9b524cf8 Update README.md pascal-lc 2025-02-10 10:23:13 +08:00
  • 2ba39877d6 Merge 9500f4436a into 19d3c8293b xndcn 2025-02-10 10:15:12 +08:00
  • f3ee51ea17 llamafile: use member variable instead of constant for iq4nlt jmorganca 2025-02-09 18:09:21 -08:00
  • 23f770a51d Merge 2991954b7d into 19d3c8293b Eric Curtin 2025-02-10 03:06:31 +01:00
  • d5251da695 Merge 1fccfc9eb6 into 19d3c8293b Emreerdog 2025-02-10 09:24:47 +08:00
  • 3ab621175a server (webui): Fix issue with muliple <think> tags Stéphane du Hamel 2025-02-10 02:06:32 +01:00
  • 8446b617c5 Fixed if formatting Daniele 2025-02-09 23:49:17 +00:00
  • 5f7326b0c1 vulkan: improve im2col performance Daniele 2025-02-09 23:30:13 +00:00
  • 01db429161 fix test-chat (update delta to latest r1 template change) ochafik 2025-02-09 22:58:26 +00:00
  • 95a2982558 Merge 19d84e08a3 into 19d3c8293b lexasub 2025-02-09 17:33:10 -05:00
  • 8409bf185d fix test_calc_result & test_thoughts ochafik 2025-02-09 22:12:35 +00:00
  • 1c7a165e59 Add enum class HuapengZhou 2025-02-09 13:37:30 -08:00
  • ea2f41e0d2 add models/templates/README.md ochafik 2025-02-09 21:04:19 +00:00
  • a29dc921ec fix server test_tool_calls.py ochafik 2025-02-09 21:01:35 +00:00
  • 31cfa39811 server : fix check for URI length to prevent incorrect HTTP 414 errors Brett Profitt 2025-02-09 15:50:01 -05:00
  • 9878b58dad Add enum class HuapengZhou 2025-02-09 12:33:38 -08:00
  • e1bff8f66c update deepseek r1 templates (+ put update commands in ./scripts/get_chat_template.py's comments) ochafik 2025-02-09 20:12:28 +00:00
  • 30dcfaa57a rm wrong warning in command-r parser (when normal text) ochafik 2025-02-09 18:13:32 +00:00
  • 8d82be902e sync: minja (https://github.com/ggerganov/llama.cpp/pull/11774) ochafik 2025-02-09 18:09:26 +00:00
  • 392f99a33a sync: minja (a72057e519) ochafik 2025-02-09 18:00:42 +00:00
  • e80c485077 Merge 56979aebea into 19d3c8293b yushihang 2025-02-09 22:56:17 +05:30
  • cfb0ae7e4c ggml : fix more imatrix nan cases sl/more-imatrix-nan-fixes slaren 2025-02-09 18:15:02 +01:00
  • 1be357d990 Merge branch 'master' into compilade/imatrix-batched-chunks compilade/imatrix-batched-chunks Francis Couture-Harpin 2025-02-09 12:06:24 -05:00
  • db502ddd0e Merge branch 'master' into compilade/imatrix-batched-chunks Francis Couture-Harpin 2025-02-09 12:06:15 -05:00
  • 8c871fb8c9 Merge a2da5a2c0b into 19d3c8293b Brian 2025-02-09 08:46:39 -08:00
  • d4bdfc6314 better way to disable for arm sl/mmid-cpu-perf slaren 2025-02-09 17:20:09 +01:00
  • 1198aecd3f Merge de9d2c6f09 into 19d3c8293b Diego Devesa 2025-02-09 19:02:44 +03:00
  • 91542ca245 tool-calls: allow r1 output to miss <think> opening tag (since latest template update adds it) ochafik 2025-02-09 15:50:21 +00:00
  • e598e7aa10 sync: minja (https://github.com/google/minja/pull/52) ochafik 2025-02-09 15:49:52 +00:00
  • 2d493d26ab Merge remote-tracking branch 'origin/master' into sl/mmid-cpu-perf slaren 2025-02-09 16:27:24 +01:00
  • b26af62e7e cleanup slaren 2025-02-09 16:27:19 +01:00
  • 1b90527d78 disable for arm slaren 2025-02-09 16:22:56 +01:00
  • 0f0d8c3ae7 allocate chunk counter in wdata parallelize src1 quantization by column to allows parallelization even when there is only one row slaren 2025-02-05 16:17:33 +01:00
  • 434d9e86e7 Merge 4f74deacea into 19d3c8293b yushihang 2025-02-09 14:05:52 +01:00
  • 0b3b646631 docs: utilize the forward slash (/) as the path separator for Unix-like systems jason_w 2025-02-09 20:44:04 +08:00
  • 52007f933a Merge 8e69669007 into 19d3c8293b Mika Pi 2025-02-09 21:22:46 +09:00
  • 6718968d4e Merge e00c9d1c5e into 19d3c8293b Eric Curtin 2025-02-09 03:23:51 -08:00
  • 19d3c8293b There's a better way of clearing lines (#11756) b4677 Eric Curtin 2025-02-09 10:34:49 +00:00
  • dfaa96c519 devops: increase timeout of Vulkan tests again Rémy O 2025-02-01 19:58:00 +01:00
  • ab2f56e270 vulkan: define MMV kernels for IQ1 quantizations Rémy O 2025-02-01 19:27:21 +01:00
  • fa92caae18 vulkan: initial support for IQ1_S and IQ1_M quantizations Rémy O 2025-01-30 04:28:49 +01:00
  • 941efc054f tests: remove invalid test-backend-ops REPEAT_BACK tests Rémy O 2025-02-09 10:14:32 +01:00
  • bc349762d8 vulkan: implement GGML_OP_REPEAT_BACK Rémy O 2025-02-09 10:07:20 +01:00
  • e6a2c06bbb vulkan: fix check_results RWKV_WKV6 crash and memory leaks Rémy O 2025-02-08 20:32:05 +01:00
  • 9526033b71 vulkan: implement GGML_OP_OPT_STEP_ADAMW Rémy O 2025-02-08 19:41:06 +01:00
  • 095f8d17ac vulkan: implement GGML_OP_COUNT_EQUAL Rémy O 2025-02-08 14:01:27 +01:00
  • 148f58681b vulkan: implement GGML_OP_SUB Rémy O 2025-02-08 13:40:29 +01:00
  • deb15e3f53 vulkan: implement GGML_OP_ARGMAX Rémy O 2025-02-08 11:58:36 +01:00
  • abf4c2ef74 vulkan: support GGML_OP_SUM Rémy O 2025-02-08 10:47:05 +01:00
  • 5c1d8a946f vulkan: support memset_tensor Rémy O 2025-02-08 10:46:51 +01:00
  • bfc2e9f4ea Merge a279f17815 into 98f6b0fd1e pancake 2025-02-09 16:58:48 +08:00
  • ded0672be8 Merge ccfdca810e into 98f6b0fd1e Mr-Thack 2025-02-09 14:17:29 +05:30
  • 185e1b107e Added GPU support on qwen2vl readme Undo changes on qwen2vl-cli sami 2025-02-09 15:41:11 +07:00
  • 98f6b0fd1e vulkan: account for lookup tables when checking shared memory size (#11502) b4676 Jeff Bolz 2025-02-09 01:43:51 -06:00
  • e491f3bb9e Merge df11fb7033 into 55ac8c7791 Daniel J Walsh 2025-02-09 18:05:16 +13:00
  • c679d59795 Merge 5cb6209de5 into 55ac8c7791 Leonard 2025-02-09 02:57:11 +01:00
  • 95cddfd8fb rm thoughts from generic parser ochafik 2025-02-09 01:27:58 +00:00
  • 437af24a48 Merge 6893f3ac5d into 55ac8c7791 Brian 2025-02-08 17:36:52 -07:00
  • 2e54433a9d Merge remote-tracking branch 'origin/master' into sl/custom-tensor-offload sl/custom-tensor-offload slaren 2025-02-09 00:36:06 +01:00
  • 8770ffa60c rebuild buft list on every call slaren 2025-02-09 00:32:52 +01:00
  • 3cbdbe8947 upload vulkan x86 builds Eve 2025-02-08 16:23:14 -05:00
  • 2ab608b152 remove memset that causes buffer overflow Co-authored-by: camel-cdr <camel-cdr@protonmail.com> Xuan Son Nguyen 2025-02-08 22:19:44 +01:00
  • 6278c767ee Merge branch 'master' into xsn/wasm_simd Xuan Son Nguyen 2025-02-08 22:12:49 +01:00
  • 9feb8e351e mm subgroup size Eve 2025-02-08 16:05:15 -05:00
  • 55ac8c7791 server : (webui) revamp Settings dialog, add Pyodide interpreter (#11759) b4675 Xuan-Son Nguyen 2025-02-08 21:54:50 +01:00
  • 495c32cbe4 Merge 93278f84cf into e6e6583199 Don Mahurin 2025-02-09 02:19:48 +05:30
  • 7791845e2c Merge branch 'master' into xsn/webui_pyodide Xuan Son Nguyen 2025-02-08 20:15:07 +01:00
  • e6e6583199 server : (webui) increase edit textarea size (#11763) Woof Dog 2025-02-08 19:09:55 +00:00
  • 85da9172b6 (small tweak) add small animation to make it feels like claude Xuan Son Nguyen 2025-02-08 20:01:57 +01:00
  • b829cab72f fix test-chat ochafik 2025-02-08 18:46:20 +00:00
  • 475b2906ba speed up by loading pyodide on page load Xuan Son Nguyen 2025-02-08 19:39:26 +01:00
  • a59fde2955 update model template / format mapping ochafik 2025-02-08 18:21:29 +00:00