CUDA: Fixed OpenLLaMA 3b mmq, reduced compile time (#2590)
parent b19edd54d5
commit f64d44a9b9
2 changed files with 587 additions and 391 deletions
ggml-cuda.cu (976 changes)
File diff suppressed because it is too large