Alberto Cabrera Pérez
2e82ffa4af
sycl : Fixes to broken builds and test-backend-ops ( #10257 )
...
* Fixes broken build for the SYCL CUDA backend caused by non-explicit gemm call in outprod (merged in with RWKV6 in
Optimize RWKV6 Operator Naming and Implement Multi-core CPU/ SYCL Acceleration #10133 )
* Marks permuted MUL_MAT as unsupported to be able to run test-backend-ops
* Fixes asserts in norm to fix debug builds.
2024-11-13 09:40:57 +00:00
Molly Sophia
2d5dd7bb3f
ggml : add epsilon as a parameter for group_norm ( #8818 )
...
Signed-off-by: Molly Sophia <mollysophia379@gmail.com>
2024-08-06 10:26:46 +03:00
AidanBeltonS
f4444d992c
[SYCL] Use multi_ptr to clean up deprecated warnings ( #8256 )
2024-07-10 16:10:49 +01:00
luoyu-intel
a9554e20b6
[SYCL] Fix WARP_SIZE=16 bug of Intel GPU ( #8266 )
...
* fix group_norm ut
* split softmax
* fix softmax
* add concat support condition
* revert debug code
* move QK_WARP_SIZE to presets.hpp
2024-07-05 13:06:13 +08:00
Neo Zhang Jianyu
f09b7cb609
rm get_work_group_size() by local cache for performance ( #8286 )
...
Co-authored-by: arthw <14088817+arthw@users.noreply.github.com>
2024-07-05 10:32:29 +08:00
luoyu-intel
d08c20edde
[SYCL] Fix the sub group size of Intel ( #8106 )
...
* use warp_size macro for all sycl kernels
* fix mask of permute_sub_group_by_xor
* fix rms_norm with correct warp number
* fix rms_norm_f32/group_norm_f32
* move norm to norm.cpp file
* fix quantize bug
* fix mmvq's batch size
2024-07-02 10:16:00 +08:00