ggml : generalize quantize_fns for simpler FP16 handling (#1237)

* Generalize quantize_fns for simpler FP16 handling

* Remove call to ggml_cuda_mul_mat_get_wsize

* ci : disable FMA for mac os actions

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
This commit is contained in:
Stephan Walter 2023-07-05 16:13:06 +00:00 committed by GitHub
parent 8567c76b53
commit 1b107b8550
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
9 changed files with 172 additions and 548 deletions

View file

@ -137,9 +137,10 @@ jobs:
- name: Build
id: cmake_build
run: |
sysctl -a
mkdir build
cd build
cmake -DLLAMA_AVX2=OFF ..
cmake -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF ..
cmake --build . --config Release
- name: Test