[SYCL] refactor (#6408)

* separate lower precision GEMM from the main files
* fix workgroup size hardcode
2024-06-19 09:11:51 +08:00
commit 623494a478
parent 37bef89433
13 changed files with 7600 additions and 6997 deletions
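The "fix workgroup size hardcode" item drops the fixed `GROUP_SIZE 1024` that had been carried over from CUDA. The commit's actual replacement logic is not visible in this excerpt, so the following is only a minimal sketch of the general idea: derive the work-group size from a device limit rather than a constant. The helper `pick_work_group_size` and the constant `kWarpSize` are hypothetical names used for illustration and are not part of ggml-sycl; in SYCL the limit itself would typically be queried with `device.get_info<sycl::info::device::max_work_group_size>()`.

```cpp
#include <algorithm>
#include <cstddef>

// Mirrors WARP_SIZE from presets.hpp; the name kWarpSize is an assumption.
constexpr std::size_t kWarpSize = 32;

// Hypothetical helper (not from ggml-sycl): clamp a requested work-group size
// to the device limit. The limit is passed in as a plain number so the sketch
// compiles without a SYCL toolchain.
inline std::size_t pick_work_group_size(std::size_t requested,
                                        std::size_t device_max) {
    std::size_t size = std::min(requested, device_max);
    // When at least one full sub-group fits, round down to a whole number of
    // sub-groups; otherwise return the clamped value unchanged.
    if (size >= kWarpSize) {
        size -= size % kWarpSize;
    }
    return size;
}
```

With a device that reports a limit of 256, `pick_work_group_size(1024, 256)` yields 256 instead of the old hardcoded 1024, which is the kind of mismatch a fixed constant can cause on devices with smaller limits.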
--- a/ggml-sycl/presets.hpp
+++ b/ggml-sycl/presets.hpp
@@ -18,8 +18,6 @@
 #define GGML_SYCL_MAX_DEVICES       48
 #define GGML_SYCL_NAME "SYCL"

-// FIXME: 1024 from cuda
-#define GROUP_SIZE 1024
 #define WARP_SIZE 32
 #define MATRIX_ROW_PADDING 512 // last row of quant. matrices is a multiple of this to avoid out-of-bounds memory accesses