llama.cpp/ggml
2cf0d9ef3b ggml-cpu-aarch64: Fix compilation issues (Dongyan Qian)
In function 'block_q4_0x4 make_block_q4_0x4(block_q4_0*, unsigned int)',
 inlined from 'int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)' at
ggml-cpu-aarch64.cpp:3614:19: warning: writing 32 bytes into a region of size 0 [-Wstringop-overflow=]
 3614 |             memcpy(&out.qs[dst_offset], &elems, sizeof(uint64_t));
      |             ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ggml-cpu-aarch64.cpp: In function 'int repack_q4_0_to_q4_0_4_bl(ggml_tensor*, int, const void*, size_t)':
ggml-cpu-aarch64.cpp:3685:20: note: at offset 72 into destination object '<anonymous>' of size 72
 3685 |             *dst++ = make_block_q4_0x4(dst_tmp, interleave_block);
      |             ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Dongyan Qian <dongyan0314@gmail.com>
2025-02-08 10:05:32 +08:00
cmake cmake: add ggml find package (#11369) 2025-01-26 12:07:48 -04:00
include CUDA: use mma PTX instructions for FlashAttention (#11583) 2025-02-02 19:31:09 +01:00
src ggml-cpu-aarch64: Fix compilation issues 2025-02-08 10:05:32 +08:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt cmake: Add ability to pass in GGML_BUILD_NUMBER (ggml/1096) 2025-02-04 12:59:15 +02:00