CUDA: faster q2_K, q3_K MMQ + int8 tensor cores (#7921)

* CUDA: faster q2_K, q3_K MMQ + int8 tensor cores

* try CI fix

* try CI fix

* try CI fix

* fix data race

* revert q2_K precision-related changes
Johannes Gäßler 2024-06-14 18:41:49 +02:00 committed by GitHub
parent 66ef1ceedf
commit 76d66ee0be
6 changed files with 468 additions and 330 deletions
