Commit graph

  • c67cc9837d ggml: aarch64: implement SVE kernels for q4_K_q8_K vector dot (#11227) b4491 fj-y-saito 2025-01-16 18:11:49 +09:00
  • aa11846752 Update build.yml jiahao su 2025-01-16 14:19:38 +08:00
  • c69f9baf2b update readme caitianchi 2025-01-16 14:16:38 +08:00
  • 52c6abe0df no use make caitianchi 2025-01-16 14:11:41 +08:00
  • ed02c70ce4 update readme caitianchi 2025-01-16 13:57:36 +08:00
  • 9db9b205b6 Merge 6acdb265fc into adc5dd92e8 Milot Mirdita 2025-01-16 05:23:38 +00:00
  • 91dcd6301c Modify cann image version jiahao su 2025-01-16 11:16:14 +08:00
  • 10b5b2b3fc Update build.yml jiahao su 2025-01-16 10:38:24 +08:00
  • f77d5ce232 Merge branch 'ggerganov:master' into master jiahao su 2025-01-16 10:27:59 +08:00
  • deab32760a fix sanitizer compile Johannes Gäßler 2025-01-15 21:15:16 +01:00
  • adc5dd92e8 vulkan: scale caching for k quants + misc fixes (#11081) Eve 2025-01-15 19:50:13 +00:00
  • 0334f6bfd3 CUDA: backwards pass for misc. ops, add tests Johannes Gäßler 2025-01-12 15:08:28 +01:00
  • 1782462bd4 llama : add llama_model_load_from_splits Xuan Son Nguyen 2025-01-15 17:55:37 +01:00
  • f11cfdfd7f ci : use -no-cnv in gguf-split tests (#11254) Georgi Gerganov 2025-01-15 18:28:35 +02:00
  • 3e794f2eba scripts : fix [no ci] Georgi Gerganov 2025-01-15 18:28:12 +02:00
  • d8956a4ce3 Remove unused variable nscipione 2025-01-15 16:53:38 +01:00
  • b0f14c5c2a Formatting nscipione 2025-01-15 15:13:44 +00:00
  • 414a66f6b1 ci : use -no-cnv in requantize tests Georgi Gerganov 2025-01-15 16:47:13 +02:00
  • 9e839a54d6 ci : use -no-cnv in gguf-split tests Georgi Gerganov 2025-01-15 16:25:38 +02:00
  • 492eaad571 ci : change python3 -> python gg/ci-python Georgi Gerganov 2025-01-15 16:18:56 +02:00
  • 6b77639258 Reorder member variable to avoid warning on initialization nscipione 2025-01-15 14:47:40 +01:00
  • 1d8504338e fix: ggml: fix vulkan-shaders-gen build (#10448) b4488 Junil Kim 2025-01-15 22:17:42 +09:00
  • 432df2d5f9 RoPE: fix back, CUDA support for back + noncont. (#11240) b4487 Johannes Gäßler 2025-01-15 12:51:37 +01:00
  • c87818ee71 fix comments reg. non-cont. RoPE support [no-ci] Johannes Gäßler 2025-01-15 10:12:53 +01:00
  • ee11dea6d6 Remove unnecessary headers and cast nscipione 2025-01-15 10:05:48 +01:00
  • f4d1fbc79f refactor: GGML_VULKAN_SHADERS_GEN_TOOLCHAIN Junil Kim 2025-01-15 19:16:08 +09:00
  • 0afea98ef0 Implement host pool for matrix_info nscipione 2025-01-14 16:07:13 +00:00
  • a58d32dfe6 fix lint RunningLeon 2025-01-15 14:36:43 +08:00
  • 0ccd7f3eb2 examples : add embd_to_audio to tts-outetts.py [no ci] (#11235) Daniel Bevenius 2025-01-15 05:44:38 +01:00
  • ee3fae86d5 add readme caitianchi 2025-01-15 12:20:31 +08:00
  • f446c2cf6a SYCL: Add gated linear attention kernel (#11175) b4485 Akarshan Biswas 2025-01-15 08:50:17 +05:30
  • 964f81173d Update ggml/src/ggml-cpu/ggml-cpu-quants.c fj-y-saito 2025-01-15 12:13:17 +09:00
  • 6fdbf07181 refactor: Rename host_toolchain.cmake.in Junil Kim 2025-01-15 09:14:00 +09:00
  • b4d92a59a2 ci : add -no-cnv for tests (#11238) Xuan Son Nguyen 2025-01-14 15:42:23 +01:00
  • 07fe2cbb26 ci : add -no-cnv for tests Xuan Son Nguyen 2025-01-14 15:25:23 +01:00
  • a899673346 Added chat template support to llama-run Michael Engel 2025-01-13 14:02:13 +01:00
  • aee5ac4e13 RoPE: fix back, CUDA support for back + noncont. Johannes Gäßler 2025-01-13 19:03:12 +01:00
  • 160d6eee9a examples : add embd_to_audio to tts-outetts.py [no ci] Daniel Bevenius 2025-01-14 13:08:47 +01:00
  • 3ed670b6dd Merge remote-tracking branch 'origin/master' into jinja Olivier Chafik 2025-01-14 12:17:07 +00:00
  • d47f40caea Update test-chat-template.cpp Olivier Chafik 2025-01-14 12:14:39 +00:00
  • 010726ce17 Merge remote-tracking branch 'origin/master' into tool-call Olivier Chafik 2025-01-14 12:12:14 +00:00
  • e183fa9e7e Update test-chat-template.cpp Olivier Chafik 2025-01-14 12:11:33 +00:00
  • e3cf4dc384 Merge remote-tracking branch 'origin/master' into cuda-releases Olivier Chafik 2025-01-14 12:09:33 +00:00
  • 91e4fc1c0c support internlm3 RunningLeon 2025-01-06 20:32:10 +08:00
  • bbf3e55e35 vocab : add dummy tokens for "no_vocab" type (#11231) Georgi Gerganov 2025-01-14 12:54:58 +02:00
  • c5bf0d1bd7 server : Improve code snippets direction between RTL text (#11221) ebraminio 2025-01-14 14:09:33 +03:30
  • 091592d758 Refactor test-chat-template.cpp (#11224) b4481 Olivier Chafik 2025-01-14 10:16:41 +00:00
  • 44d1e796d0 sync : ggml Georgi Gerganov 2025-01-14 10:39:42 +02:00
  • 0cf9a06799 vocab : minor [no ci] gg/vocab-fix-no-vocab Georgi Gerganov 2025-01-14 10:36:18 +02:00
  • 69fc940d9a vocab : add dummy tokens for "no_vocab" type Georgi Gerganov 2025-01-14 10:26:47 +02:00
  • a4f3f5d8e6 scripts : sync gguf (cont) Georgi Gerganov 2025-01-14 09:40:15 +02:00
  • 48e1ae0e61 scripts : sync gguf Georgi Gerganov 2025-01-14 09:36:58 +02:00
  • d00a80e89d scripts : sync opencl Georgi Gerganov 2025-01-14 09:19:58 +02:00
  • 5f98b7c31b Add SVE support for q4_K_q8_K y-saito fujitsu 2025-01-14 13:46:34 +09:00
  • 6664d4709f cleanup pr and remove explicit floats VJHack 2025-01-13 22:21:38 -06:00
  • f08e6f5bdc removed commented tests VJHack 2025-01-13 20:32:08 -06:00
  • 66ec17eb53 Update build.yml ochafik 2025-01-14 02:20:50 +00:00
  • 41b4b11340 Fix container tags + rename ochafik 2025-01-14 02:19:41 +00:00
  • 4232406510 Merge remote-tracking branch 'origin/master' into cuda-releases ochafik 2025-01-14 01:27:45 +00:00
  • 7a7d6f6a22 Fix merge ochafik 2025-01-14 01:14:35 +00:00
  • b29deb83cd format VJHack 2025-01-13 19:08:30 -06:00
  • 0f7501c913 format VJHack 2025-01-13 19:05:37 -06:00
  • a590dcb7f6 format VJHack 2025-01-13 19:05:15 -06:00
  • 66cffa8aff resolve merge conflicts VJHack 2025-01-13 19:02:04 -06:00
  • e7ff6ecd93 Merge branch 'jinja' into tool-call ochafik 2025-01-14 00:55:30 +00:00
  • a54ccb910b Update test-chat-template.cpp Olivier Chafik 2025-01-14 00:38:16 +00:00
  • d1adeb95b7 Refactor test-chat-template ochafik 2025-01-13 20:11:27 +00:00
  • 1b3bb7eeb9 Update arg.cpp Olivier Chafik 2025-01-14 00:07:18 +00:00
  • d905a9e9b7 cleaned up pr VJHack 2025-01-13 17:51:40 -06:00
  • 54ef105c85 added tests and fixed nsigma impl VJHack 2025-01-13 17:48:35 -06:00
  • 4daae0bfc7 Update run.cpp ochafik 2025-01-13 23:26:31 +00:00
  • a57bb94e29 Update test_chat_completion.py ochafik 2025-01-13 23:18:03 +00:00
  • b7e21710c4 Update utils.py ochafik 2025-01-13 23:11:57 +00:00
  • b4083e4155 Test chat_template in e2e test ochafik 2025-01-13 23:10:52 +00:00
  • a6afb2735f Update common_chat_format_example to use minja template wrapper ochafik 2025-01-13 22:57:35 +00:00
  • c04c50e40c Merge remote-tracking branch 'origin/master' into jinja ochafik 2025-01-13 22:26:13 +00:00
  • 8fb681bf9a updated readme VJHack 2025-01-13 16:17:39 -06:00
  • 8dd4f334a4 Add --jinja to llama-run ochafik 2025-01-13 22:07:49 +00:00
  • 18f257bf1a Fix deprecation ochafik 2025-01-13 21:30:48 +00:00
  • 7c84ebc231 Test templates w/ minja ochafik 2025-01-13 21:23:30 +00:00
  • bee4c7c9fa apply parameter to only llama-cli VJHack 2025-01-13 15:12:50 -06:00
  • da038d8715 completed top nsigma sampler implementation VJHack 2025-01-13 14:46:12 -06:00
  • 1c158d030c server : Improve code snippets direction between RTL text Ebrahim Byagowi 2025-01-13 23:18:40 +03:30
  • 1aac99ad54 Refactor test-chat-template ochafik 2025-01-13 20:11:27 +00:00
  • 78861a3eb2 Wire LLM_KV_TOKENIZER_CHAT_TEMPLATE_N in llama_model_chat_template ochafik 2025-01-13 19:58:15 +00:00
  • cb72cf1fc3 Merge remote-tracking branch 'origin/master' into jinja ochafik 2025-01-13 19:56:27 +00:00
  • 504af20ee4 server : (UI) Improve messages bubble shape in RTL (#11220) ebraminio 2025-01-13 22:53:31 +03:30
  • 84a44815f7 cli : auto activate conversation mode if chat template is available (#11214) b4475 Xuan Son Nguyen 2025-01-13 20:18:12 +01:00
  • 5a9f30f04f llama-server : Improve messages bubble shape in RTL Ebrahim Byagowi 2025-01-13 22:29:36 +03:30
  • 73c4e77de8 do not activate -cnv for non-instruct models Xuan Son Nguyen 2025-01-13 19:40:07 +01:00
  • ae86ff36eb llama-bench : add "test" field with test label in all output formats Stanisław Szymczyk 2025-01-13 19:05:24 +01:00
  • ec9ede2d5f Merge c2b26000c3 into 39509fb082 Xuan Son Nguyen 2025-01-13 18:03:20 +01:00
  • 39509fb082 cuda : CUDA Graph Compute Function Refactor (precursor for performance improvements) (#11042) b4474 Andreas Kieslinger 2025-01-13 16:45:53 +01:00
  • dd221bdd6b update readme (2) Xuan Son Nguyen 2025-01-13 16:45:23 +01:00
  • 5226732fc4 remove double lines between functions slaren 2025-01-13 16:44:02 +01:00
  • 723d77ceba update readme (writing with the help of chatgpt) Xuan Son Nguyen 2025-01-13 16:37:53 +01:00
  • e6e9a6f52c Merge branch 'master' into xsn/cli_auto_cnv Xuan Son Nguyen 2025-01-13 16:27:01 +01:00
  • ce6bde4a6d added kalavai to infrastructure list Carlos Fernandez Musoles 2025-01-13 14:53:46 +00:00
  • 6e9ebda9ea init caitianchi 2025-01-13 22:00:39 +08:00
  • a29f0870d4 contrib : add naming guidelines (cont) (#11177) Georgi Gerganov 2025-01-13 15:59:26 +02:00