vbatts/llama.cpp
2646 commits · 380 branches · 3056 tags · 365 MiB
Commit graph for branch gg/flash-attn-a (2 commits)
Author           SHA1        Message                                       Date
Georgi Gerganov  08e69c5008  cuda : adapt soft_max to F16 mask and pos     2024-03-28 19:40:11 +02:00
slaren           ae1f211ce2  cuda : refactor into multiple files (#6269)   2024-03-25 13:50:23 +01:00