Q6_K AVX improvements (#10118)

* q6_k instruction reordering attempt
* better subtract method
* should be theoretically faster. Small improvement with shuffle LUT, likely because all loads are already done at that stage
* optimize bit fiddling
* handle -32 offset separately. bsums exists for a reason!
* use shift
* Update ggml-quants.c
* have to update CI macOS version to 13 as 12 doesn't work now. 13 is still x86
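
The "handle -32 offset separately" item can be illustrated with a scalar sketch. It assumes the 6-bit quants are stored unsigned with a +32 bias and that the activation block carries per-16-value sums (the role `bsums` appears to play). All names below (`N_VALUES`, `GROUP`, `dot_offset_inline`, `dot_offset_separate`, `fill_example`) are hypothetical and are not taken from ggml-quants.c; this is plain C, not the AVX code the commit changes.

```c
#include <stdint.h>

#define N_VALUES 256  /* values per super-block (assumed for illustration) */
#define GROUP    16   /* values sharing one scale (assumed for illustration) */

/* Baseline: remove the 32 bias from every quant before multiplying. */
static int32_t dot_offset_inline(const uint8_t *q, const int8_t *y,
                                 const int8_t *scales) {
    int32_t acc = 0;
    for (int i = 0; i < N_VALUES; ++i) {
        acc += scales[i / GROUP] * ((int32_t)q[i] - 32) * y[i];
    }
    return acc;
}

/* Alternative: multiply the biased quants directly, then apply one
 * correction per group using the precomputed sums of y:
 *   sum s*(q-32)*y  ==  sum s*q*y  -  32 * sum_g s_g * bsum_g          */
static int32_t dot_offset_separate(const uint8_t *q, const int8_t *y,
                                   const int8_t *scales,
                                   const int16_t *bsums) {
    int32_t acc = 0;
    int32_t corr = 0;
    for (int g = 0; g < N_VALUES / GROUP; ++g) {
        int32_t group = 0;
        for (int j = 0; j < GROUP; ++j) {
            group += (int32_t)q[g * GROUP + j] * y[g * GROUP + j];
        }
        acc  += scales[g] * group;
        corr += scales[g] * bsums[g];
    }
    return acc - 32 * corr;
}

/* Deterministic example data; bsums[g] is the sum of y over group g. */
static void fill_example(uint8_t *q, int8_t *y, int8_t *scales,
                         int16_t *bsums) {
    for (int g = 0; g < N_VALUES / GROUP; ++g) {
        scales[g] = (int8_t)(g * 5 - 40);
        bsums[g] = 0;
        for (int j = 0; j < GROUP; ++j) {
            int i = g * GROUP + j;
            q[i] = (uint8_t)((i * 7 + 3) % 64);         /* 6-bit range 0..63 */
            y[i] = (int8_t)(((i * 13) % 255) - 127);    /* -127..127 */
            bsums[g] = (int16_t)(bsums[g] + y[i]);
        }
    }
}
```

Both functions return the same integer for any input. The second form moves the subtraction out of the inner loop, which is presumably the saving the commit message refers to.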
2024-11-04 22:06:31 +00:00
commit 3407364776
parent d5a409e57f
2 changed files with 38 additions and 51 deletions
--- a/.github/workflows/build.yml
+++ b/.github/workflows/build.yml
@@ -92,7 +92,7 @@ jobs:
          name: llama-bin-macos-arm64.zip

  macOS-latest-cmake-x64:
-    runs-on: macos-12
+    runs-on: macos-13

    steps:
      - name: Clone