CUDA: optimize and refactor MMQ (#8416)
* CUDA: optimize and refactor MMQ
* explicit q8_1 memory layouts, add documentation
parent a977c11544
commit 808aba3916
5 changed files with 867 additions and 687 deletions
File diff suppressed because it is too large