llama.cpp

Jeff Bolz aea8ddd516 vulkan: fix coopmat2 validation failures (#11284) mul mat and flash attention shaders were loading f32 types directly into A/B matrices, which happens to work but is technically invalid usage. For FA, we can load it as an Accumulator matrix and convert; this is not in the inner loop and is cheap enough. For mul mat, it's more efficient to do this conversion in a separate pass and have the input(s) be f16. coopmat2 requires SPIR-V 1.6 (related to using LocalSizeId). LocalSizeId requires maintenance4 be enabled, and SPIR-V 1.6 requires Vulkan 1.3.		2025-01-20 10:38:32 -06:00
ggml-blas	ggml : add support for dynamic loading of backends (#10469)	2024-11-25 15:13:39 +01:00
ggml-cann	llama : add Qwen2VL support + multimodal RoPE (#10361)	2024-12-14 14:43:46 +02:00
ggml-cpu	vulkan: support copy from f32 to q4_0/q4_1/q5_0/q5_1/q8_0/iq4_nl (#11166)	2025-01-16 22:47:10 +01:00
ggml-cuda	CUDA: backwards pass for misc. ops, add tests (#11257)	2025-01-16 16:43:38 +01:00
ggml-hip	ggml : do not define GGML_USE_CUDA when building with GGML_BACKEND_DL (#11211)	2025-01-13 13:31:41 +02:00
ggml-kompute	llama : add Qwen2VL support + multimodal RoPE (#10361)	2024-12-14 14:43:46 +02:00
ggml-metal	ggml : do not install metal source when embed library (ggml/1054)	2025-01-04 16:09:53 +02:00
ggml-musa	mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make (#10516)	2024-11-26 17:00:41 +01:00
ggml-opencl	common, examples, ggml : fix MSYS2 GCC compiler errors and warnings when building with LLAMA_CURL=ON and GGML_OPENCL=ON (#11013)	2024-12-31 01:46:06 +01:00
ggml-rpc	rpc : code cleanup (#11107)	2025-01-07 08:37:02 +02:00
ggml-sycl	SYCL: Introducing memory host pool (#11251)	2025-01-19 21:33:34 +08:00
ggml-vulkan	vulkan: fix coopmat2 validation failures (#11284)	2025-01-20 10:38:32 -06:00
CMakeLists.txt	GGUF: C++ refactor, backend support, misc fixes (#11030)	2025-01-07 18:01:58 +01:00
ggml-alloc.c	CUDA: backwards pass for misc. ops, add tests (#11257)	2025-01-16 16:43:38 +01:00
ggml-backend-impl.h	rpc : early register backend devices (#11262)	2025-01-17 10:57:09 +02:00
ggml-backend-reg.cpp	ggml : allow loading backend with env variable (ggml/1059)	2025-01-08 13:40:18 +02:00
ggml-backend.cpp	ggml-backend : only offload from host buffers (fix) (#11124)	2025-01-07 16:11:57 +01:00
ggml-common.h	CUDA: rename macros to avoid conflicts with WinAPI (#10736)	2024-12-10 18:23:24 +01:00
ggml-impl.h	GGUF: C++ refactor, backend support, misc fixes (#11030)	2025-01-07 18:01:58 +01:00
ggml-opt.cpp	ggml-opt: fix data corruption (ggml/1022)	2024-11-21 09:22:02 +02:00
ggml-quants.c	ggml : refactor online repacking (#10446)	2024-12-07 14:37:50 +02:00
ggml-quants.h	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-threading.cpp	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-threading.h	remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797)	2024-12-12 19:02:49 +01:00
ggml.c	CUDA: backwards pass for misc. ops, add tests (#11257)	2025-01-16 16:43:38 +01:00
gguf.cpp	cmake : add sanitizer flags for llama.cpp (#11279)	2025-01-18 16:18:15 +02:00