Commit graph

  • b15fede7a9 kv-cache : fix defrag condition Georgi Gerganov 2025-02-06 14:34:45 +02:00
  • 9ab42dc722 docs: update fedora cuda guide for 12.8 release (#11393) Tei Home 2025-02-06 20:16:15 +08:00
  • d9959cb7af fix more latex Xuan Son Nguyen 2025-02-06 13:05:01 +01:00
  • 99bbe263f6 ggml : Fix warnings when running CPU CI locally on LoongArch Jinyang He 2025-02-06 16:31:40 +08:00
  • 45aa1db6cf ggml : optimize loongarch_asx extend i16,i8,u8 to i32,i16 Jinyang He 2025-02-06 14:05:15 +08:00
  • e6d955ebcd ggml : optimize convert f32<->f16 for loongarch_asx Jinyang He 2025-02-06 14:03:19 +08:00
  • 194b2e69f8 SYCL: Adjust support condition for norm operators (#11674) Akarshan Biswas 2025-02-06 17:12:35 +05:30
  • 9dd7a0390f llama : add log about loading model tensors (#11699) Georgi Gerganov 2025-02-06 13:41:37 +02:00
  • 124df6e7c9 improve MarkdownDisplay Xuan Son Nguyen 2025-02-06 12:33:52 +01:00
  • c0d4843225 build : fix llama.pc (#11658) b4651 Adrien Gallouët 2025-02-06 12:08:13 +01:00
  • 32b8ce5b96 cont : better logic Georgi Gerganov 2025-02-06 13:07:10 +02:00
  • 734a808f59 CMakeLists.txt: respect CMAKE_INSTALL_LIBDIR for llama.pc Martin Jansa 2025-02-06 10:49:08 +00:00
  • 04c01e9c95 llama : fix defrag logic Georgi Gerganov 2025-02-06 12:48:53 +02:00
  • 71235f6020 fix unused var Xuan Son Nguyen 2025-02-06 11:24:38 +01:00
  • 64c5bbae29 lint and format combined Xuan Son Nguyen 2025-02-06 11:20:01 +01:00
  • c8dc8d7f55 allow multiple generations at the same time Xuan Son Nguyen 2025-02-06 11:18:00 +01:00
  • d1d0a61b00 squash! common : add default embeddings presets Daniel Bevenius 2025-02-06 10:43:16 +01:00
  • 8d4d2be143 ggml : fix LoongArch compile error with 128-bit SIMD (#11701) junchao-zhao 2025-02-06 17:20:00 +08:00
  • 61f410f9e4 squash! common : add default embeddings presets Daniel Bevenius 2025-02-06 09:50:22 +01:00
  • 0f1c1cab2c Merge branch 'master' into gg/llama-kv-cache Georgi Gerganov 2025-02-06 10:04:33 +02:00
  • e0d913fccb llama : clear whitespaces Georgi Gerganov 2025-02-06 10:02:50 +02:00
  • e07000b525 squash! common : add default embeddings presets [no ci] Daniel Bevenius 2025-02-06 08:55:11 +01:00
  • e00c9d1c5e Update examples/server/server.cpp fall-back-to-jinja Eric Curtin 2025-02-05 22:59:16 +00:00
  • 452be2637f Fix LoongArch compile error with 128-bit SIMD yala 2025-02-06 15:45:30 +08:00
  • 0bec4f6b4d update readme of minicpm-v caitianchi 2025-02-06 15:38:50 +08:00
  • 3b6a0a817a llama : add log about loading model tensors gg/llama-add-log Georgi Gerganov 2025-02-06 09:24:07 +02:00
  • 27f59dbaaa squash! llama : rename batch.logits to batch.output Daniel Bevenius 2025-02-06 08:00:30 +01:00
  • aa80c94e08 Merge d277fdcf43 into 2c6c8df56d Mohammadreza Hendiani 2025-02-06 07:28:03 +01:00
  • 2c6c8df56d vulkan: optimize coopmat2 iq2/iq3 callbacks (#11521) b4649 Jeff Bolz 2025-02-06 00:15:30 -06:00
  • 8a7e3bf17a vulkan: initial support for IQ4_XS quantization (#11501) b4648 Rémy O 2025-02-06 07:09:59 +01:00
  • 4a666111d5 Merge 0c2ff18cc8 into 1b598b3058 Rémy O 2025-02-06 08:09:17 +02:00
  • 1b598b3058 vulkan: use smaller combined allocations to avoid fragmentation (#11551) b4647 Jeff Bolz 2025-02-06 00:02:18 -06:00
  • d797f4ac4c Merge 4ff0831ce6 into 902368a06b Georgi Gerganov 2025-02-05 22:53:54 -05:00
  • 5450d18371 Merge 14f64dab74 into 902368a06b Yann Follet 2025-02-05 22:53:54 -05:00
  • 249c5bdfe7 Merge c24778df01 into 902368a06b Yann Follet 2025-02-05 22:53:54 -05:00
  • 8fcc0b72f5 Merge 9605c5fb28 into 902368a06b Georgi Gerganov 2025-02-05 22:53:40 -05:00
  • 6e824b0dd0 Merge 3453401cfe into 902368a06b Milot Mirdita 2025-02-05 22:53:40 -05:00
  • 6fb2591266 Merge 44ec40a43a into 902368a06b Herman Semenoff 2025-02-05 22:53:40 -05:00
  • c5c9676583 Merge 3100a05ba1 into 902368a06b Herman Semenoff 2025-02-05 22:53:40 -05:00
  • 25422fc6ba Merge 79eac2727a into 902368a06b savesanketsw 2025-02-05 22:53:40 -05:00
  • 9500f4436a Use named LOG_COL_* colors in examples xndcn 2025-02-06 11:31:41 +08:00
  • 4f74deacea fix: free meta memory in clip model loading yushihang 2025-02-06 11:17:21 +08:00
  • 56979aebea fix: ensure proper cleanup of img_res_v.data in all code paths yushihang 2025-02-06 10:38:28 +08:00
  • 902368a06b metal : avoid breaking build when metal API predates TARGET_OS_VISION (#11690) b4646 Charles Duffy 2025-02-05 19:52:31 -06:00
  • c3db0480bb readme : add link to Autopen under UIs (#11684) Matvey Soloviev 2025-02-06 01:55:25 +01:00
  • 538f60934a ggml : fix possible underflow in ggml_nbytes slaren 2025-02-06 01:32:04 +01:00
  • a4e9e4d41b Update examples/server/server.cpp Eric Curtin 2025-02-05 22:59:10 +00:00
  • 8e2b1c653a metal : avoid breaking build when metal API predates TARGET_OS_VISION Charles Duffy 2025-02-05 16:52:43 -06:00
  • 518e077a92 remove lang from html tag Xuan Son Nguyen 2025-02-05 23:44:46 +01:00
  • 52d2d92c7e Merge cfb1b2277f into d774ab3acc Herman Semenoff 2025-02-05 23:08:49 +01:00
  • 61a84cf843 Merge b83cae088c into d774ab3acc Georgi Gerganov 2025-02-05 23:00:16 +01:00
  • 82ab825ec9 add lint and format check on CI Xuan Son Nguyen 2025-02-05 22:36:00 +01:00
  • 58499a8df9 bring back thought process Xuan Son Nguyen 2025-02-05 22:29:54 +01:00
  • d23abdc3f6 When llama_chat_apply_template doesn't work Eric Curtin 2025-02-05 21:08:23 +00:00
  • 699e8e0fc7 bring back copy btn Xuan Son Nguyen 2025-02-05 21:59:12 +01:00
  • 3667a0a4a3 Add example clip cli and enhance tensor name processing in Janus converter ravenouse 2025-02-05 20:42:35 +00:00
  • 2dc24892f5 readme : add link to Autopen under UIs Matvey Soloviev 2025-02-05 20:58:00 +01:00
  • cc27754437 fix auto scroll Xuan Son Nguyen 2025-02-05 19:06:55 +01:00
  • e90f1618db Merge 97a7157e11 into d774ab3acc vincent 2025-02-06 01:00:12 +08:00
  • 86083c1d7c Merge a7f5c74795 into d774ab3acc JohnnyB 2025-02-05 17:33:57 +01:00
  • d1a064070f revert tool example backfill change - Command R7B just needs the right template Olivier Chafik 2025-02-05 16:33:37 +00:00
  • 994301da12 use existing string_strip Olivier Chafik 2025-02-05 16:33:16 +00:00
  • 33efcb3c59 Update README.md Olivier Chafik 2025-02-05 16:20:11 +00:00
  • 0d172936be init version Xuan Son Nguyen 2025-02-05 17:20:01 +01:00
  • 098629df15 disable some failing chatml tests Olivier Chafik 2025-02-05 16:15:19 +00:00
  • 0917e0a80d fix --think arg env Olivier Chafik 2025-02-05 16:15:09 +00:00
  • a726adaef7 Copy sampler parameters from chat template Mason M 2025-02-05 12:07:22 -04:00
  • 39b50c37dc Update README.md Olivier Chafik 2025-02-05 15:53:48 +00:00
  • e6d9b52480 align Command R7B w/ --think / reasoning_content behaviour Olivier Chafik 2025-02-05 15:47:37 +00:00
  • 7700971196 vulkan: use smaller combined allocations to avoid fragmentation Jeff Bolz 2025-01-31 09:12:44 -06:00
  • 19f2ff1362 common : add default embeddings presets Daniel Bevenius 2025-02-05 15:11:09 +01:00
  • 3e08f37b08 Switch kernel selection order to dotprod and i8mm Charles Xu 2025-02-05 15:20:07 +01:00
  • 947158ee52 Specify podman works in Container documentation podman Eric Curtin 2025-02-05 13:46:03 +00:00
  • 3841a163ef fix compiler warning about parens Olivier Chafik 2025-02-05 13:05:27 +00:00
  • a30111bef3 Merge branch 'ggerganov:master' into llamacli-tools bandoti 2025-02-05 08:48:47 -04:00
  • 2c4c6391bd SYCL: Adjust support condition for norm operators Akarshan Biswas 2025-02-05 18:15:31 +05:30
  • f3e9f8b62a fix test_thoughts ochafik 2025-02-05 12:34:27 +00:00
  • d20c2ce4e7 Merge branch 'r1-toolcall' of github.com:ochafik/llama.cpp into r1-toolcall ochafik 2025-02-05 12:16:42 +00:00
  • 9d7c3cc51b --think to force any model to return reasoning_content (or just parse <think> for deepseek r1) ochafik 2025-02-05 12:16:37 +00:00
  • d774ab3acc metal : adjust support conditions for norm operators (#11671) b4644 Georgi Gerganov 2025-02-05 10:57:42 +02:00
  • cfa2cc1e40 Disable non-contiguous tensor support in norm kernels and add newline at the end of debug logs Akarshan Biswas 2025-02-05 13:33:46 +05:30
  • b004e0bc6b metal : adjust support conditions for norm operators Georgi Gerganov 2025-02-05 10:03:32 +02:00
  • fa62da9b2d CUDA: support for mat. mul. with ne03 != ne13 (#11656) b4643 Johannes Gäßler 2025-02-05 08:58:31 +01:00
  • 1ec208083c llava: add quantization for the visual projector LLAVA, Qwen2VL (#11644) b4642 SAMI 2025-02-05 14:45:40 +07:00
  • 937351835c Removed trailing whitespace sami 2025-02-05 14:24:06 +07:00
  • cdb305cd7d Fixed the gcc warning regarding minor linting sami 2025-02-05 14:11:13 +07:00
  • c03ffc3d23 fix old glm4 models tv1wnd 2025-02-05 07:41:20 +01:00
  • 291a785587 llama : rename batch.logits to batch.output Daniel Bevenius 2024-10-22 15:03:00 +02:00
  • ef4222e9f4 Merge branch 'master' into fix-bug-in-minicpm-v-code tc-mb 2025-02-05 11:42:12 +08:00
  • efb5773bc2 ggml-sycl: hide matrix engine info for now from print sycl devices Akarshan Biswas 2025-02-05 09:01:25 +05:30
  • 0b602f0ecd Final touches Akarshan Biswas 2025-02-03 21:08:25 +05:30
  • 52b0652601 conv: add space before eof Akarshan Biswas 2025-02-03 18:47:50 +05:30
  • e5926374a5 Add remaining SYCL exception handler to kernel and refactor Akarshan Biswas 2025-02-03 18:44:49 +05:30
  • 7369e54b33 Add back ggml_sycl_set_device to kernels Akarshan Biswas 2025-02-03 11:53:22 +05:30
  • 0ae9a07cf8 Add debug logs to ggml_sycl_op_argmax and ggml_sycl_mul_mat Akarshan Biswas 2025-02-03 11:15:43 +05:30
  • 18d706ab0e gemm.hpp: remove unused include Akarshan Biswas 2025-02-03 10:38:56 +05:30
  • 539b0c662e ggml-sycl: sort includes Akarshan Biswas 2025-02-03 10:01:07 +05:30
  • 6eb30d9403 Adjust EOF spaces and unused variable Akarshan Biswas 2025-02-02 19:09:23 +05:30
  • a6a239cf39 norm: add a space at the end of file Akarshan Biswas 2025-02-02 19:03:29 +05:30
  • 6dbb7ac827 softmax: handle SYCL exceptions and add debug logs Akarshan Biswas 2025-02-02 18:40:01 +05:30