llama.cpp/ggml/src
Jeff Bolz aea8ddd516
vulkan: fix coopmat2 validation failures (#11284)
The mul mat and flash attention shaders were loading f32 types directly into
A/B matrices, which happens to work but is technically invalid usage.
For FA, we can load the data as an Accumulator matrix and convert it; this
is not in the inner loop, so it is cheap enough. For mul mat, it's more
efficient to do the conversion in a separate pass and have the input(s)
be f16.

coopmat2 requires SPIR-V 1.6 (related to using LocalSizeId). LocalSizeId
requires maintenance4 to be enabled, and SPIR-V 1.6 requires Vulkan 1.3.
2025-01-20 10:38:32 -06:00
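
The requirement chain in the commit message above (coopmat2 needs SPIR-V 1.6, which needs Vulkan 1.3, and LocalSizeId needs maintenance4) can be sketched as a single capability check. This is a minimal illustration, not the code from the commit: `DeviceCaps`, `make_api_version`, and `can_use_coopmat2` are hypothetical names, and the struct stands in for values a backend would normally query from the Vulkan device. The version packing mirrors the bit layout of Vulkan's `VK_MAKE_API_VERSION` with variant 0, so the sketch compiles without the Vulkan headers.

```cpp
#include <cassert>
#include <cstdint>

// Same bit layout as Vulkan's VK_MAKE_API_VERSION with variant 0:
// major in bits 22..28, minor in bits 12..21, patch in bits 0..11.
constexpr uint32_t make_api_version(uint32_t major, uint32_t minor, uint32_t patch) {
    return (major << 22) | (minor << 12) | patch;
}

// Hypothetical summary of the device properties relevant to coopmat2.
struct DeviceCaps {
    uint32_t api_version;   // packed version, as in VkPhysicalDeviceProperties::apiVersion
    bool     maintenance4;  // maintenance4 feature available and enabled
    bool     coopmat2;      // device exposes the coopmat2 extension
};

// coopmat2 shaders are only usable when every link of the chain holds:
//   coopmat2 -> SPIR-V 1.6 -> Vulkan 1.3, and LocalSizeId -> maintenance4.
constexpr bool can_use_coopmat2(const DeviceCaps & caps) {
    return caps.coopmat2
        && caps.maintenance4
        && caps.api_version >= make_api_version(1, 3, 0);
}

// Small usage example: only the first device satisfies all three conditions.
inline void example_checks() {
    assert( can_use_coopmat2({make_api_version(1, 3, 0), true,  true}));
    assert(!can_use_coopmat2({make_api_version(1, 2, 0), true,  true}));  // Vulkan too old
    assert(!can_use_coopmat2({make_api_version(1, 3, 0), false, true}));  // no maintenance4
}
```

Folding the three conditions into one predicate means a caller can fall back to a non-coopmat2 path whenever any one of them is missing, rather than failing validation later.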
ggml-blas ggml : add support for dynamic loading of backends (#10469) 2024-11-25 15:13:39 +01:00
ggml-cann llama : add Qwen2VL support + multimodal RoPE (#10361) 2024-12-14 14:43:46 +02:00
ggml-cpu vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (#11166) 2025-01-16 22:47:10 +01:00
ggml-cuda CUDA: backwards pass for misc. ops, add tests (#11257) 2025-01-16 16:43:38 +01:00
ggml-hip ggml : do not define GGML_USE_CUDA when building with GGML_BACKEND_DL (#11211) 2025-01-13 13:31:41 +02:00
ggml-kompute llama : add Qwen2VL support + multimodal RoPE (#10361) 2024-12-14 14:43:46 +02:00
ggml-metal ggml : do not install metal source when embed library (ggml/1054) 2025-01-04 16:09:53 +02:00
ggml-musa mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make (#10516) 2024-11-26 17:00:41 +01:00
ggml-opencl common, examples, ggml : fix MSYS2 GCC compiler errors and warnings when building with LLAMA_CURL=ON and GGML_OPENCL=ON (#11013) 2024-12-31 01:46:06 +01:00
ggml-rpc rpc : code cleanup (#11107) 2025-01-07 08:37:02 +02:00
ggml-sycl SYCL: Introducing memory host pool (#11251) 2025-01-19 21:33:34 +08:00
ggml-vulkan vulkan: fix coopmat2 validation failures (#11284) 2025-01-20 10:38:32 -06:00
CMakeLists.txt GGUF: C++ refactor, backend support, misc fixes (#11030) 2025-01-07 18:01:58 +01:00
ggml-alloc.c CUDA: backwards pass for misc. ops, add tests (#11257) 2025-01-16 16:43:38 +01:00
ggml-backend-impl.h rpc : early register backend devices (#11262) 2025-01-17 10:57:09 +02:00
ggml-backend-reg.cpp ggml : allow loading backend with env variable (ggml/1059) 2025-01-08 13:40:18 +02:00
ggml-backend.cpp ggml-backend : only offload from host buffers (fix) (#11124) 2025-01-07 16:11:57 +01:00
ggml-common.h CUDA: rename macros to avoid conflicts with WinAPI (#10736) 2024-12-10 18:23:24 +01:00
ggml-impl.h GGUF: C++ refactor, backend support, misc fixes (#11030) 2025-01-07 18:01:58 +01:00
ggml-opt.cpp ggml-opt: fix data corruption (ggml/1022) 2024-11-21 09:22:02 +02:00
ggml-quants.c ggml : refactor online repacking (#10446) 2024-12-07 14:37:50 +02:00
ggml-quants.h ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-threading.cpp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-threading.h remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797) 2024-12-12 19:02:49 +01:00
ggml.c CUDA: backwards pass for misc. ops, add tests (#11257) 2025-01-16 16:43:38 +01:00
gguf.cpp cmake : add sanitizer flags for llama.cpp (#11279) 2025-01-18 16:18:15 +02:00