llama.cpp/ggml/src
Sergio Lopez 8b47fce07a vulkan: enable the use of simpler matmul shaders
Import simpler matmul shaders from the kompute backend and use them on
GPUs known to be unable to use the regular ones.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2025-02-10 11:52:52 +01:00
ggml-blas ggml : add support for dynamic loading of backends (#10469) 2024-11-25 15:13:39 +01:00
ggml-cann llama : add Qwen2VL support + multimodal RoPE (#10361) 2024-12-14 14:43:46 +02:00
ggml-cpu ggml: Fix data race in ggml threadpool (#11736) 2025-02-08 15:30:53 +01:00
ggml-cuda CUDA: fix min. version for movmatrix (#11751) 2025-02-08 10:46:07 +01:00
ggml-hip HIP: force max threads per block to be 1024 (#11621) 2025-02-04 19:18:38 +01:00
ggml-kompute llama : add Qwen2VL support + multimodal RoPE (#10361) 2024-12-14 14:43:46 +02:00
ggml-metal metal : avoid breaking build when metal API predates TARGET_OS_VISION (#11690) 2025-02-06 09:52:31 +08:00
ggml-musa CUDA: use mma PTX instructions for FlashAttention (#11583) 2025-02-02 19:31:09 +01:00
ggml-opencl common, examples, ggml : fix MSYS2 GCC compiler errors and warnings when building with LLAMA_CURL=ON and GGML_OPENCL=ON (#11013) 2024-12-31 01:46:06 +01:00
ggml-rpc rpc: fix known RCE in rpc-server (ggml/1103) 2025-02-06 21:22:54 +02:00
ggml-sycl SYCL: remove XMX info from print devices (#11712) 2025-02-07 09:27:53 +00:00
ggml-vulkan vulkan: enable the use of simpler matmul shaders 2025-02-10 11:52:52 +01:00
CMakeLists.txt ci: use sccache on windows instead of ccache (#11545) 2025-01-31 17:12:40 +00:00
ggml-alloc.c vulkan: use smaller combined allocations to avoid fragmentation (#11551) 2025-02-06 07:02:18 +01:00
ggml-backend-impl.h rpc : early register backend devices (#11262) 2025-01-17 10:57:09 +02:00
ggml-backend-reg.cpp ggml : allow loading backend with env variable (ggml/1059) 2025-01-08 13:40:18 +02:00
ggml-backend.cpp ggml-backend : only offload from host buffers (fix) (#11124) 2025-01-07 16:11:57 +01:00
ggml-common.h CUDA: rename macros to avoid conflicts with WinAPI (#10736) 2024-12-10 18:23:24 +01:00
ggml-impl.h GGUF: C++ refactor, backend support, misc fixes (#11030) 2025-01-07 18:01:58 +01:00
ggml-opt.cpp ggml-opt: fix data corruption (ggml/1022) 2024-11-21 09:22:02 +02:00
ggml-quants.c ggml : refactor online repacking (#10446) 2024-12-07 14:37:50 +02:00
ggml-quants.h ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-threading.cpp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-threading.h remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797) 2024-12-12 19:02:49 +01:00
ggml.c ggml : add option to not print stack on abort (ggml/1081) 2025-01-29 11:24:53 +02:00
gguf.cpp cmake : add sanitizer flags for llama.cpp (#11279) 2025-01-18 16:18:15 +02:00