vbatts/llama.cpp
2957 commits · 380 branches · 3056 tags · 365 MiB
Commit graph at c3f8d58356 (2 commits)
Author            SHA1        Message                                                     Date
Johannes Gäßler   133d99c599  CUDA: deduplicate FlashAttention code (#7352)               2024-05-18 12:36:25 +02:00
Johannes Gäßler   0fc1e820a9  CUDA: faster large batch FA without tensor cores (#7314)   2024-05-17 18:54:52 +02:00
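Both commits touch llama.cpp's CUDA FlashAttention kernels. For context, below is a minimal, illustrative CUDA sketch of the online-softmax accumulation that FlashAttention-style kernels build on: keys are streamed in a single pass and the running accumulator is rescaled whenever the maximum logit grows. The kernel name, single-thread layout, and sizes are hypothetical simplifications for clarity; this is not the llama.cpp implementation from these commits.

```
// Illustrative sketch only — not the llama.cpp kernels from the commits above.
#include <cstdio>
#include <cmath>
#include <cuda_runtime.h>

// One thread computes attention for a single query vector of dimension D
// against N key/value rows, using the online-softmax recurrence so the
// logits never need to be materialized or scanned twice.
template <int D>
__global__ void attn_online_softmax(const float *q, const float *K,
                                    const float *V, float *out, int N) {
    if (threadIdx.x != 0 || blockIdx.x != 0) return;

    float m = -INFINITY;     // running max of logits
    float l = 0.0f;          // running sum of exp(logit - m)
    float acc[D] = {0.0f};   // running weighted sum of V rows

    for (int j = 0; j < N; ++j) {
        float s = 0.0f;                                  // logit = q·k_j / sqrt(D)
        for (int d = 0; d < D; ++d) s += q[d] * K[j * D + d];
        s *= rsqrtf((float) D);

        const float m_new = fmaxf(m, s);
        const float scale = expf(m - m_new);             // rescale old accumulator
        const float p     = expf(s - m_new);             // weight for this key
        l = l * scale + p;
        for (int d = 0; d < D; ++d)
            acc[d] = acc[d] * scale + p * V[j * D + d];
        m = m_new;
    }
    for (int d = 0; d < D; ++d) out[d] = acc[d] / l;     // normalize at the end
}

int main() {
    constexpr int D = 4, N = 3;
    float *q, *K, *V, *out;
    cudaMallocManaged(&q, D * sizeof(float));
    cudaMallocManaged(&K, N * D * sizeof(float));
    cudaMallocManaged(&V, N * D * sizeof(float));
    cudaMallocManaged(&out, D * sizeof(float));
    for (int i = 0; i < D; ++i) q[i] = 1.0f;
    for (int i = 0; i < N * D; ++i) { K[i] = 0.1f * i; V[i] = 1.0f * i; }

    attn_online_softmax<D><<<1, 1>>>(q, K, V, out, N);
    cudaDeviceSynchronize();
    for (int d = 0; d < D; ++d) printf("out[%d] = %f\n", d, out[d]);
    return 0;
}
```

Production kernels such as llama.cpp's tile this loop across warps and blocks, and (per the second commit) can also avoid tensor cores for large batches; the single-pass rescaling shown here is the invariant all of those variants preserve.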