CUDA: faster FlashAttention, kernel for bs == 1

Johannes Gäßler 2024-03-29 23:02:39 +01:00 committed by Georgi Gerganov
parent 08e69c5008
commit 75aa7b4b18

File diff suppressed because it is too large.