CUDA: faster q2_K, q3_K MMQ + int8 tensor cores (#7921)
* CUDA: faster q2_K, q3_K MMQ + int8 tensor cores
* try CI fix
* try CI fix
* try CI fix
* fix data race
* revert q2_K precision-related changes
parent 66ef1ceedf
commit 76d66ee0be
6 changed files with 468 additions and 330 deletions
File diff suppressed because it is too large