llama.cpp/ggml/src
Sergio Lopez 8b47fce07a vulkan: enable the use of simpler matmul shaders
Import simpler matmul shaders from the kompute backend and use them on
GPUs known to be unable to use the regular ones.

Signed-off-by: Sergio Lopez <slp@redhat.com>
2025-02-10 11:52:52 +01:00
ggml-blas ggml : add support for dynamic loading of backends (#10469) 2024-11-25 15:13:39 +01:00
ggml-cann llama : add Qwen2VL support + multimodal RoPE (#10361) 2024-12-14 14:43:46 +02:00
ggml-cpu ggml: Fix data race in ggml threadpool (#11736) 2025-02-08 15:30:53 +01:00
ggml-cuda CUDA: fix min. version for movmatrix (#11751) 2025-02-08 10:46:07 +01:00
ggml-hip HIP: force max threads per block to be 1024 (#11621) 2025-02-04 19:18:38 +01:00
ggml-kompute llama : add Qwen2VL support + multimodal RoPE (#10361) 2024-12-14 14:43:46 +02:00
ggml-metal metal : avoid breaking build when metal API predates TARGET_OS_VISION (#11690) 2025-02-06 09:52:31 +08:00
ggml-musa CUDA: use mma PTX instructions for FlashAttention (#11583) 2025-02-02 19:31:09 +01:00
ggml-opencl common, examples, ggml : fix MSYS2 GCC compiler errors and warnings when building with LLAMA_CURL=ON and GGML_OPENCL=ON (#11013) 2024-12-31 01:46:06 +01:00
ggml-rpc rpc: fix known RCE in rpc-server (ggml/1103) 2025-02-06 21:22:54 +02:00
ggml-sycl SYCL: remove XMX info from print devices (#11712) 2025-02-07 09:27:53 +00:00
ggml-vulkan vulkan: enable the use of simpler matmul shaders 2025-02-10 11:52:52 +01:00
CMakeLists.txt ci: use sccache on windows instead of ccache (#11545) 2025-01-31 17:12:40 +00:00
ggml-alloc.c vulkan: use smaller combined allocations to avoid fragmentation (#11551) 2025-02-06 07:02:18 +01:00
ggml-backend-impl.h rpc : early register backend devices (#11262) 2025-01-17 10:57:09 +02:00
ggml-backend-reg.cpp ggml : allow loading backend with env variable (ggml/1059) 2025-01-08 13:40:18 +02:00
ggml-backend.cpp ggml-backend : only offload from host buffers (fix) (#11124) 2025-01-07 16:11:57 +01:00
ggml-common.h CUDA: rename macros to avoid conflicts with WinAPI (#10736) 2024-12-10 18:23:24 +01:00
ggml-impl.h GGUF: C++ refactor, backend support, misc fixes (#11030) 2025-01-07 18:01:58 +01:00
ggml-opt.cpp ggml-opt: fix data corruption (ggml/1022) 2024-11-21 09:22:02 +02:00
ggml-quants.c ggml : refactor online repacking (#10446) 2024-12-07 14:37:50 +02:00
ggml-quants.h ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-threading.cpp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-threading.h remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797) 2024-12-12 19:02:49 +01:00
ggml.c ggml : add option to not print stack on abort (ggml/1081) 2025-01-29 11:24:53 +02:00
gguf.cpp cmake : add sanitizer flags for llama.cpp (#11279) 2025-01-18 16:18:15 +02:00