llama.cpp/ggml/src/ggml-vulkan/vulkan-shaders
Eve adc5dd92e8
vulkan: scale caching for k quants + misc fixes (#11081)
* q6_k scale caching

* 16 bit unpack

* q4_k test (slow)

* revert it

* q3_k

* q2_k

* little stuff

* try precalculating products of a and q2_k scales

* Revert "try precalculating products of a and q2_k scales"

This reverts commit 65110b81f23f66331a50c6e889a7c1ab9470a86b.

* unpack should be u16, add vim swap to gitignore (about time)

* better q4_k scales

* q5_k

* better q6_k with separate paths for all threads and partial threads in use, plus some more optimizations

* q2_k better dequant

* q3_k optimizations

* q3_k use hmask simd from cpu avx version

* make the caches happy

* q3_k separate out calculation

* q2_k separate out

* little stuff

* use calc_superblock everywhere

* q2_k optimize scale calculation

* more barriers
2025-01-15 19:50:13 +00:00
acc.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
add.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
argsort.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
clamp.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
CMakeLists.txt fix: ggml: fix vulkan-shaders-gen build (#10448) 2025-01-15 14:17:42 +01:00
concat.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
contig_copy.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
copy.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
cos.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
dequant_f32.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_funcs.comp vulkan: small mul_mat_vec optimizations (#10665) 2024-12-13 09:42:04 +01:00
dequant_funcs_cm2.comp vulkan: optimize coopmat2 dequant functions (#10855) 2024-12-21 08:04:45 +01:00
dequant_head.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_iq4_nl.comp vulkan: copy iq4_nl LUT into shared memory (#10409) 2024-11-20 08:40:18 +01:00
dequant_q2_k.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q3_k.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q4_0.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q4_1.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q4_k.comp Vulkan: Use improved q4_k and q5_k dequant code in dequant shaders (#10798) 2024-12-12 18:36:00 +01:00
dequant_q5_0.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q5_1.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q5_k.comp Vulkan: Use improved q4_k and q5_k dequant code in dequant shaders (#10798) 2024-12-12 18:36:00 +01:00
dequant_q6_k.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
dequant_q8_0.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
diag_mask_inf.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
div.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
flash_attn_cm2.comp vulkan: Add VK_NV_cooperative_matrix2 support for mul_mat and flash attention (#10206) 2024-12-05 20:15:05 +01:00
gelu.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
gelu_quick.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
generic_binary_head.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
generic_head.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
generic_unary_head.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
get_rows.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
get_rows_quant.comp vulkan: small mul_mat_vec optimizations (#10665) 2024-12-13 09:42:04 +01:00
group_norm.comp vulkan: fix group_norm (#10496) 2024-11-26 16:45:05 +01:00
im2col.comp vulkan: im2col and matmul optimizations for stable diffusion (#10942) 2024-12-29 10:16:34 +01:00
leaky_relu.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
mul.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
mul_mat_split_k_reduce.comp vulkan: optimize and reenable split_k (#10637) 2024-12-03 20:29:54 +01:00
mul_mat_vec.comp Vulkan: Fix float16 use on devices without float16 support + fix subgroup_size_control validation error (#11161) 2025-01-10 06:39:33 +01:00
mul_mat_vec_base.comp vulkan: optimize mul_mat for small values of N (#10991) 2024-12-30 18:27:11 +01:00
mul_mat_vec_nc.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
mul_mat_vec_p021.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
mul_mat_vec_q2_k.comp vulkan: scale caching for k quants + misc fixes (#11081) 2025-01-15 19:50:13 +00:00
mul_mat_vec_q3_k.comp vulkan: scale caching for k quants + misc fixes (#11081) 2025-01-15 19:50:13 +00:00
mul_mat_vec_q4_k.comp vulkan: scale caching for k quants + misc fixes (#11081) 2025-01-15 19:50:13 +00:00
mul_mat_vec_q5_k.comp vulkan: scale caching for k quants + misc fixes (#11081) 2025-01-15 19:50:13 +00:00
mul_mat_vec_q6_k.comp vulkan: scale caching for k quants + misc fixes (#11081) 2025-01-15 19:50:13 +00:00
mul_mm.comp Vulkan: VK_KHR_cooperative_matrix support to speed up prompt processing (#10597) 2024-12-07 10:24:15 +01:00
mul_mm_cm2.comp vulkan: Add VK_NV_cooperative_matrix2 support for mul_mat and flash attention (#10206) 2024-12-05 20:15:05 +01:00
norm.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
pad.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
pool2d.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
relu.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
repeat.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
rms_norm.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
rope_head.comp vulkan: request round-to-even for fp16 in im2col/rope_head (#10767) 2024-12-10 21:23:17 +01:00
rope_neox.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
rope_norm.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
scale.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
silu.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
sin.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
soft_max.comp Vulkan: Fix float16 use on devices without float16 support + fix subgroup_size_control validation error (#11161) 2025-01-10 06:39:33 +01:00
square.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
sum_rows.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
tanh.comp Vulkan: fix NaN in tanh.comp with AMD proprietary driver on Windows (#10723) 2024-12-08 19:19:19 +01:00
test_coopmat2_support.comp vulkan: compile a test shader in cmake to check for coopmat2 support (#10713) 2024-12-08 09:05:55 +01:00
test_coopmat_support.comp Disable GL_KHR_cooperative_matrix Vulkan extension if not available. (#11117) 2025-01-08 09:18:13 +01:00
timestep_embedding.comp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
types.comp Vulkan: Fix float16 use on devices without float16 support + fix subgroup_size_control validation error (#11161) 2025-01-10 06:39:33 +01:00
upscale.comp vulkan: Use push constant offset to handle misaligned descriptors (#10987) 2024-12-29 09:35:11 +01:00
vulkan-shaders-gen.cpp fix: ggml: fix vulkan-shaders-gen build (#10448) 2025-01-15 14:17:42 +01:00
wkv6.comp rwkv6: add wkv6 support for Vulkan backend (#10829) 2024-12-16 22:00:46 +01:00