llama.cpp

fxzjshm 3ec9fd4b77 HIP: force max threads per block to be 1024 (#11621) Some old/vendor forked version of llvm still use 256. Explicitly set it to 1024 to align with upstream llvm. Signed-off-by: fxzjshm <fxzjshm@163.com>	2025-02-04 19:18:38 +01:00
cmake	cmake: add ggml find package (#11369)	2025-01-26 12:07:48 -04:00
include	CUDA: use mma PTX instructions for FlashAttention (#11583)	2025-02-02 19:31:09 +01:00
src	HIP: force max threads per block to be 1024 (#11621)	2025-02-04 19:18:38 +01:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	cmake: Add ability to pass in GGML_BUILD_NUMBER (ggml/1096)	2025-02-04 12:59:15 +02:00