CUDA: refactor mmq, dmmv, mmvq (#7716)

* CUDA: refactor mmq, dmmv, mmvq

* fix out-of-bounds write

* struct for qk, qr, qi

* fix cmake build

* mmq_type_traits
Johannes Gäßler 2024-06-05 16:53:00 +02:00 committed by GitHub
parent 2b3389677a
commit 7d1a378b8f
112 changed files with 1783 additions and 1767 deletions

@@ -0,0 +1,5 @@
// This file has been autogenerated by generate_cu_files.py, do not edit manually.
#include "../mmq.cuh"
DECL_MMQ_CASE(GGML_TYPE_Q5_0);