Commit graph

  • bf3ca6bd42 vulkan: implement dequantize variants for coopmat2 Rémy O 2025-01-19 16:05:47 +01:00
  • 5fa89da950 vulkan: optimize Q3_K by removing branches Rémy O 2025-01-19 11:52:19 +01:00
  • 055e4287f7 vulkan: initial support for IQ2_XS Rémy O 2025-01-19 11:51:27 +01:00
  • 953b47ea18 vulkan: initial support for IQ2_XXS Rémy O 2025-01-19 11:50:13 +01:00
  • a0dae0b1f9 vulkan: initial support for IQ3_XXS Rémy O 2025-01-19 11:46:58 +01:00
  • 1843136445 vulkan: initial support for IQ3_S Rémy O 2025-01-19 11:44:45 +01:00
  • 2ee90d3b9f rpc: fix register position thxCode 2025-01-26 21:54:20 +08:00
  • ce730637e8 llama : Update tensor names in DeepSeek2 MLA implementation. Stanisław Szymczyk 2025-01-26 12:50:17 +01:00
  • c23325e550 Update examples/simple-cmake-pkg/README.md bandoti 2025-01-26 09:09:57 -04:00
  • 7f3c223988 Update examples/simple-cmake-pkg/README.md bandoti 2025-01-26 09:09:35 -04:00
  • 2cc9b8c32c readme : update hot topics Georgi Gerganov 2025-01-26 14:30:15 +02:00
  • 6da9021ab4 examples : add idle tool for investigating GPU idle overhead Georgi Gerganov 2024-11-01 09:14:54 +02:00
  • 1099ef271e 9b hf chat support liyuhang 2025-01-26 15:17:14 +08:00
  • 9f5d80923e fix template err liyuhang 2025-01-26 14:05:25 +08:00
  • f077b03bc3 fix format err liyuhang 2025-01-26 12:57:04 +08:00
  • 593cc8653d fix ci err liyuhang 2025-01-26 12:53:59 +08:00
  • d9db0929b5 fix confict liyuhang 2025-01-26 12:47:18 +08:00
  • 8cb12d43d6 remove llguidance.h from .gitignore Michal Moskal 2025-01-25 20:45:59 -08:00
  • 2a92bfbe06 code style fixes Michal Moskal 2025-01-25 20:43:33 -08:00
  • adc4aed0af clarify docs Michal Moskal 2025-01-25 20:35:41 -08:00
  • b5399d44c2 add some docs Michal Moskal 2025-01-25 20:27:07 -08:00
  • 86bce2b6d3 merge liyuhang 2025-01-26 10:40:06 +08:00
  • f35726c2fb build: apply MSVC /bigobj option to c/cpp files only (#11423) b4557 Jeff Bolz 2025-01-25 20:10:03 -06:00
  • d637812d36 build: apply MSVC /bigobj option to c/cpp files only Jeff Bolz 2025-01-25 19:51:39 -06:00
  • cbf779c450 Temporarily add logging of free device memory at the end of main Nikita Sarychev 2025-01-25 17:17:21 -08:00
  • afb6cac5ab use '%llguidance' as marker to enable llg lark syntax Michal Moskal 2025-01-25 16:57:28 -08:00
  • 7088822e5a Merge remote-tracking branch 'upstream/master' into Remove_obsolete_HIP_workaround Nikita Sarychev 2025-01-25 16:46:43 -08:00
  • f4dc4b89fa build: integrate llguidance as an external project Michal Moskal 2025-01-25 15:49:23 -08:00
  • f19655c4c0 update for new APIs Michal Moskal 2025-01-25 15:49:07 -08:00
  • f7ac792442 Merge branch 'ggerganov:master' into master Jianlin Shi 2025-01-25 16:45:20 -07:00
  • 76290d9ea0 initial porting of previous LLG patch Michal Moskal 2025-01-25 14:43:57 -08:00
  • e5ae4802db docker: add missing vulkan library to base layer and update to 24.04 rare-magma 2025-01-25 22:34:36 +01:00
  • 4a75d19376 vulkan: compile shaders on-demand (#11406) Jeff Bolz 2025-01-25 15:29:57 -06:00
  • 26771a1491 Hip: disable VMM on hip as it seams that it dosent work in some configurations (#11420) uvos 2025-01-25 21:01:12 +01:00
  • de538aa329 llama : optimize DeepSeek MLA implementation Stanisław Szymczyk 2025-01-25 18:10:22 +01:00
  • a9f166a7df vulkan: compile shaders on-demand Jeff Bolz 2025-01-24 22:08:46 -06:00
  • ca6baf76c1 build: add /bigobj to MSVC build (#11407) Jeff Bolz 2025-01-25 11:26:37 -06:00
  • 3703119ee0 Hip: disable VMM on hip as it seams that it dosent work in some configurations uvos 2025-01-25 17:18:12 +01:00
  • 6e264a905b docker : add GGML_CPU_ARM_ARCH arg to select ARM architecture to build for (#11419) Diego Devesa 2025-01-25 17:22:41 +01:00
  • 49b0e3cec4 server : fix cleaning up stream task (#11418) b4552 Xuan Son Nguyen 2025-01-25 16:36:44 +01:00
  • 172fe7f347 docker : add GGML_CPU_ARM_ARCH arg to select ARM architecture to build for slaren 2025-01-25 16:25:02 +01:00
  • 157bd11e69 one more spot Xuan Son Nguyen 2025-01-25 16:23:25 +01:00
  • 8067639cb0 server : fix cleaning up stream task Xuan Son Nguyen 2025-01-25 16:22:11 +01:00
  • 90eefc2ba4 refactor minicpm-v support Xuan Son Nguyen 2025-01-25 15:52:54 +01:00
  • 20a758155b docker : fix CPU ARM build (#11403) Diego Devesa 2025-01-25 15:22:29 +01:00
  • 6fe22643a8 add CURL to other builds slaren 2025-01-25 14:31:23 +01:00
  • 00c24acb2a ci : fix line breaks on windows builds (#11409) b4550 Georgi Gerganov 2025-01-25 13:36:48 +02:00
  • 0959cc18ee Merge branch 'master' into xsn/vision_2 Xuan Son Nguyen 2025-01-25 12:16:34 +01:00
  • 6f736127c6 ci : fix powershell line breaks Georgi Gerganov 2025-01-25 12:32:07 +02:00
  • 044d4998ae Llama-bench: allow benchmarking lora impact uvos 2025-01-25 11:09:26 +01:00
  • c5769b5c05 cont : another try Georgi Gerganov 2025-01-25 11:31:33 +02:00
  • 4a11c34f81 ci : fix line breaks on windows builds Georgi Gerganov 2025-01-25 11:16:03 +02:00
  • 9c776df90a build: add /bigobj to MSVC build Jeff Bolz 2025-01-24 23:47:02 -06:00
  • 51b7aab841 Update test_chat_completion.py Olivier Chafik 2025-01-25 04:57:40 +00:00
  • a6463c1e35 jinja: don't add bos when jinja enabled Olivier Chafik 2025-01-25 04:52:42 +00:00
  • 0208b20767 Update test_chat_completion.py Olivier Chafik 2025-01-25 04:52:03 +00:00
  • c479d39abd tool-call: allow special tokens that are grammar triggers Olivier Chafik 2025-01-25 04:51:53 +00:00
  • 6473d646d5 docker : fix CPU ARM build slaren 2025-01-25 01:19:50 +01:00
  • 466ea66f33 CANN: Add Ascend CANN build ci (#10217) b4549 jiahao su 2025-01-25 07:26:01 +08:00
  • 5f0db9522f hip : Add hipGraph and VMM support to ROCM (#11362) b4548 uvos 2025-01-25 00:02:23 +01:00
  • de9d2c6f09 test [pack] sl/pr-releases slaren 2025-01-24 22:07:27 +01:00
  • c1973cf687 Handle missing model in CLI parameters for llama-run Michael Engel 2025-01-24 21:34:19 +01:00
  • df0edbb0be test slaren 2025-01-24 22:03:31 +01:00
  • 202b1e7105 ci : allow creating artifacts on PRs on demand slaren 2025-01-24 21:36:11 +01:00
  • 894b489ada Enable VMM on rocm uvos 2025-01-23 22:33:53 +01:00
  • 580b619a07 Add hipGraph support uvos 2025-01-22 23:09:41 +01:00
  • c5d9effb49 CUDA: fix FP16 cuBLAS GEMM (#11396) b4547 Johannes Gäßler 2025-01-24 21:02:43 +01:00
  • f07c2ec505 llama : add option to override tensor buffers slaren 2025-01-24 20:56:09 +01:00
  • 835e04f590 Removed unnecessary bracket from status message Mason M 2025-01-24 15:38:19 -04:00
  • 8aa0338e16 CUDA: fix FP16 cuBLAS GEMM Johannes Gäßler 2025-01-24 20:35:02 +01:00
  • 6f9a84315f docs: build cuda update Tei Home 2025-01-25 02:32:49 +08:00
  • 448ce6a864 docs: update fedora cuda guide for 12.8 release Tei Home 2025-01-24 23:54:57 +08:00
  • bf444eeb6b Fix typo Mason M 2025-01-24 14:59:24 -04:00
  • 6388dd9cb2 Interface features require c_std_90 Mason M 2025-01-24 13:09:59 -04:00
  • 9fbadaef4f rocBLAS: Avoid fp32->fp16->fp32 conversion on cdna (#11356) b4546 uvos 2025-01-24 17:50:49 +01:00
  • 65b0d8ba4a Replace main-cmake-pkg with simple-cmake-pkg Mason M 2025-01-24 12:46:57 -04:00
  • 9755129c27 release : pack /lib in the packages (#11392) b4545 Georgi Gerganov 2025-01-24 18:41:30 +02:00
  • 9d9ac6aaf3 Avoid fp32->fp16->fp32 conversion on cdna in ggml_cuda_op_mul_mat_cublas uvos 2025-01-22 19:07:13 +01:00
  • 969b264657 Revert "TMP : push artifacts" gg/build-pack-lib-include Georgi Gerganov 2025-01-24 17:58:09 +02:00
  • 5740ec7a66 ci : change ubuntu package to 22.04 Georgi Gerganov 2025-01-24 17:05:44 +02:00
  • 872fd18420 ci : fix typo Georgi Gerganov 2025-01-24 16:46:03 +02:00
  • 39d0621872 ci : macos set build rpath to "@loader_path" Georgi Gerganov 2025-01-24 16:31:12 +02:00
  • dae44bf21a ci : change back to ubuntu latest Georgi Gerganov 2025-01-24 16:28:32 +02:00
  • 537b09e70f TMP : push artifacts Georgi Gerganov 2025-01-24 14:54:24 +02:00
  • 8b2ed1e432 ci : remove obsolete MacOS build Georgi Gerganov 2025-01-24 16:01:52 +02:00
  • f9f65f0162 ci : try to fix macos build rpaths Georgi Gerganov 2025-01-24 16:01:32 +02:00
  • 56e26a7f30 ci : change ubuntu build from latest to 20.04 Georgi Gerganov 2025-01-24 15:58:48 +02:00
  • 194358e3b7 ci : restore the original HIP commands Georgi Gerganov 2025-01-24 15:41:52 +02:00
  • a07c2c8a52 docs : Update readme to build targets for local docker build (#11368) Jafar Uruç 2025-01-24 13:30:13 +00:00
  • 50455ded31 ci : fix HIP cmake compiler options to be on first line Georgi Gerganov 2025-01-24 15:23:22 +02:00
  • 564353c9a3 Revert "TMP : push artifacts" Georgi Gerganov 2025-01-24 15:22:36 +02:00
  • 4decf2c4df TMP : push artifacts Georgi Gerganov 2025-01-24 14:54:24 +02:00
  • f0ce53f158 Merge remote-tracking branch 'origin/master' into deepseek2-mla-exp Stanisław Szymczyk 2025-01-24 13:49:30 +01:00
  • 3a35bfe1f7 cmake : put libs in /bin Georgi Gerganov 2025-01-24 14:40:48 +02:00
  • 8137b4bb2b CPU/CUDA: fix (GQA) mul mat back, add CUDA support (#11380) b4543 Johannes Gäßler 2025-01-24 12:38:31 +01:00
  • ff4cb6ef4c release : pack /lib and /include in the packages gg/build-linux-static Georgi Gerganov 2025-01-24 13:28:37 +02:00
  • 1af6945eb0 cmake : avoid -march=native when reproducible build is wanted (#11366) b4542 Bernhard M. Wiedemann 2025-01-24 12:21:35 +01:00
  • 9a391b98a5 Avoid -march=native when reproducible build is wanted Bernhard M. Wiedemann 2025-01-23 10:20:13 +01:00
  • ae4cca3e7b CPU/CUDA: fix (GQA) mul mat back, add CUDA support Johannes Gäßler 2025-01-23 22:37:56 +01:00
  • 01f37edf1a Update llama-run README.md (#11386) Eric Curtin 2025-01-24 09:39:24 +00:00