llama.cpp/ggml
Sergio Lopez bae9a58f5d vulkan: enable the use of simpler softmax shaders
Even though the regular softmax shaders successfully pass
test-backend-ops on Apple GPUs, running long inference tests has shown
that the models end up derailing, with softmax OPs being the root cause.

With this commit, we use simpler softmax shaders borrowed from the
Kompute backend (which are basically reimplementations of the Metal
shaders) on certain GPUs known to have problems with the regular ones.
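
The selection logic this implies can be sketched as follows. This is a
minimal, hypothetical C++ sketch of picking a softmax pipeline variant by
device name; the names (PipelineKind, pick_soft_max_pipeline,
k_simple_softmax_devices) are illustrative and not the actual ggml-vulkan
API, and the real shaders themselves are GLSL compute shaders, not shown here.

```cpp
// Hypothetical sketch: choose between the regular and the simpler softmax
// pipeline based on the reported Vulkan device name. All identifiers here
// are illustrative, not the real ggml-vulkan implementation.
#include <array>
#include <string>
#include <string_view>

enum class PipelineKind {
    SOFT_MAX_REGULAR, // optimized shaders used by default
    SOFT_MAX_SIMPLE,  // simpler shaders borrowed from the Kompute backend
};

// Substrings of device names known to misbehave with the regular shaders.
static constexpr std::array<std::string_view, 1> k_simple_softmax_devices = {
    "Apple",
};

PipelineKind pick_soft_max_pipeline(const std::string & device_name) {
    for (std::string_view pattern : k_simple_softmax_devices) {
        if (device_name.find(pattern) != std::string::npos) {
            return PipelineKind::SOFT_MAX_SIMPLE;
        }
    }
    return PipelineKind::SOFT_MAX_REGULAR;
}
```

In practice such a decision would be taken once at device initialization and
cached alongside the pipeline handles, rather than per dispatch.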

Signed-off-by: Sergio Lopez <slp@redhat.com>
2025-02-10 11:52:59 +01:00
cmake cmake: add ggml find package (#11369) 2025-01-26 12:07:48 -04:00
include vulkan: Make Vulkan optional at runtime (#11493). (#11494) 2025-02-10 07:17:21 +01:00
src vulkan: enable the use of simpler softmax shaders 2025-02-10 11:52:59 +01:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt cmake: Add ability to pass in GGML_BUILD_NUMBER (ggml/1096) 2025-02-04 12:59:15 +02:00