llama.cpp

Jeff Bolz	716bd6dec3	2024-12-30 18:27:11 +01:00
vulkan: optimize mul_mat for small values of N (#10991)

Make the mul_mat_vec shaders support N>1 (as a spec constant, NUM_COLS), where the batch_strides are overloaded to hold the row strides. Put the loads from the B matrix in the innermost loop because it should cache better. Share some code for reducing the result values to memory in mul_mat_vec_base.

include	tts : add OuteTTS support (#10784)	2024-12-18 19:27:21 +02:00
src	vulkan: optimize mul_mat for small values of N (#10991)	2024-12-30 18:27:11 +01:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	ggml : fix arm build (#10890)	2024-12-18 23:21:42 +01:00
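
The commit message at the top of this listing describes extending a matrix-vector kernel to a small number of output columns, with the B-matrix loads placed in the innermost loop. The following is a minimal CPU-side sketch of that loop ordering only, not the Vulkan shader code from the commit. The function name `mul_mat_small_n`, the row-major layout of A, and the `b_stride`/`d_stride` parameters are assumptions made for illustration; only `NUM_COLS` is a name taken from the commit message.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical illustration (not from the llama.cpp sources).
// A is rows x k, row-major. B holds NUM_COLS columns of length k,
// column c starting at b[c * b_stride]. D holds NUM_COLS columns of
// length rows, column c starting at d[c * d_stride].
template <std::size_t NUM_COLS>
void mul_mat_small_n(const std::vector<float>& a,
                     const std::vector<float>& b,
                     std::vector<float>& d,
                     std::size_t rows, std::size_t k,
                     std::size_t b_stride, std::size_t d_stride) {
    for (std::size_t r = 0; r < rows; ++r) {
        float acc[NUM_COLS] = {};  // one accumulator per output column
        for (std::size_t i = 0; i < k; ++i) {
            const float a_val = a[r * k + i];  // A element loaded once per (r, i)
            // B loads sit in the innermost loop, so a_val is reused
            // across all NUM_COLS columns.
            for (std::size_t c = 0; c < NUM_COLS; ++c) {
                acc[c] += a_val * b[c * b_stride + i];
            }
        }
        // Write the accumulated results for this row to each output column.
        for (std::size_t c = 0; c < NUM_COLS; ++c) {
            d[c * d_stride + r] = acc[c];
        }
    }
}
```

Making the column count a template parameter loosely mirrors the role of a compile-time constant: the inner loop has a fixed trip count that the compiler can unroll. How the actual shaders map strides and reduce results is not shown here.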