[SYCL] Fix WARP_SIZE=16 bug of Intel GPU (#8266)

* fix group_norm ut

* split softmax

* fix softmax

* add concat support condition

* revert debug code

* move QK_WARP_SIZE to presets.hpp
luoyu-intel 2024-07-05 05:06:13 +00:00 committed by GitHub
parent e235b267a2
commit a9554e20b6
8 changed files with 301 additions and 257 deletions


@@ -62,4 +62,5 @@ static_assert(K_QUANTS_PER_ITERATION == 1 || K_QUANTS_PER_ITERATION == 2, "K_QUA
 #define MUL_MAT_SRC1_COL_STRIDE 128
+#define QK_WARP_SIZE 32
 #endif // GGML_SYCL_PRESETS_HPP
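
For readers unfamiliar with why a warp-size constant matters, the following is a minimal host-side sketch, not code from this commit. It models an XOR-butterfly sum across a group of lanes in plain C++. The names `kQkWarpSize` and `butterfly_sum` are hypothetical, and the sketch does not reproduce the specific bug fixed here. It only illustrates that a reduction covers exactly as many lanes as the width it is given, so the constant a kernel is written for and the width it actually runs with need to agree.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the QK_WARP_SIZE preset shown in the diff.
constexpr int kQkWarpSize = 32;

// Host-side model of an XOR-butterfly sum over the first `width` lanes.
// Assumption: `width` is a power of two and lanes.size() >= width.
// On a device, lanes[lane ^ mask] would be read through a sub-group shuffle.
float butterfly_sum(std::vector<float> lanes, int width) {
    assert(width > 0 && (width & (width - 1)) == 0);
    assert(lanes.size() >= static_cast<std::size_t>(width));
    for (int mask = width / 2; mask > 0; mask >>= 1) {
        std::vector<float> next(lanes);  // every lane updates simultaneously
        for (int lane = 0; lane < width; ++lane) {
            next[lane] = lanes[lane] + lanes[lane ^ mask];
        }
        lanes = next;
    }
    return lanes[0];  // after the last step every lane holds the full sum
}
```

Calling `butterfly_sum(values, kQkWarpSize)` on 32 values sums all of them, while passing a width of 16 on the same data sums only the first 16. That is one way a mismatch between an assumed and an actual lane count can silently change a result.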