[SYCL] Fix the sub group size of Intel (#8106)

* use warp_size macro for all sycl kernels

* fix mask of permute_sub_group_by_xor

* fix rms_norm with correct warp number

* fix rms_norm_f32/group_norm_f32

* move norm to norm.cpp file

* fix quantize bug

* fix mmvq's batch size
This commit is contained in:
luoyu-intel 2024-07-02 02:16:00 +00:00 committed by GitHub
parent 5fac350b9c
commit d08c20edde
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
9 changed files with 587 additions and 509 deletions

View file

@ -16,7 +16,7 @@
#define GGML_SYCL_MAX_STREAMS 8
#define GGML_SYCL_MAX_BUFFERS 256
#define WARP_SIZE 32
#define WARP_SIZE GGML_SYCL_WARP_SIZE
#define MATRIX_ROW_PADDING 512 // last row of quant. matrices is a multiple of this to avoid out-of-bounds memory accesses
#define SYCL_GELU_BLOCK_SIZE 256