* CPU/CUDA: Gemma 2 FlashAttention support * apply logit_softcap to scale in kernel * disable logit softcapping tests on Metal * remove metal check |
||
|---|---|---|
| .. | ||
| ggml-alloc.h | ||
| ggml-backend.h | ||
| ggml-blas.h | ||
| ggml-cann.h | ||
| ggml-cuda.h | ||
| ggml-kompute.h | ||
| ggml-metal.h | ||
| ggml-rpc.h | ||
| ggml-sycl.h | ||
| ggml-vulkan.h | ||
| ggml.h | ||