[SYCL] fallback mmvq (#9088)

* fallback mmvq to mul_mat

* mmvq in cuda path

* Update ggml/src/ggml-sycl.cpp

Co-authored-by: Alberto Cabrera Pérez <alberto.cabrera@codeplay.com>

---------

Co-authored-by: Alberto Cabrera Pérez <alberto.cabrera@codeplay.com>
This commit is contained in:
Meng, Hengyu 2024-08-20 23:50:17 +08:00 committed by GitHub
parent 4f8d19ff17
commit 50addec9a5
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
2 changed files with 3 additions and 1 deletions

View file

@ -130,6 +130,7 @@ typedef sycl::float2 dfloat2;
#endif // GGML_SYCL_F16
#define MMVQ_MAX_BATCH_SIZE 8
#define MMVQ_MIN_BATCH_SIZE 4
static const int8_t kvalues_iq4nl[16]={-127, -104, -83, -65, -49, -35, -22, -10, 1, 13, 25, 38, 53, 69, 89, 113};