[SYCL] Fix the sub group size of Intel (#8106)
* use warp_size macro for all sycl kernels * fix mask of permute_sub_group_by_xor * fix rms_norm with correct warp number * fix rms_norm_f32/group_norm_f32 * move norm to norm.cpp file * fix quantize bug * fix mmvq's batch size
This commit is contained in:
parent
5fac350b9c
commit
d08c20edde
9 changed files with 587 additions and 509 deletions
|
@ -16,7 +16,7 @@
|
|||
#define GGML_SYCL_MAX_STREAMS 8
|
||||
#define GGML_SYCL_MAX_BUFFERS 256
|
||||
|
||||
#define WARP_SIZE 32
|
||||
#define WARP_SIZE GGML_SYCL_WARP_SIZE
|
||||
#define MATRIX_ROW_PADDING 512 // last row of quant. matrices is a multiple of this to avoid out-of-bounds memory accesses
|
||||
|
||||
#define SYCL_GELU_BLOCK_SIZE 256
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue