Default branch

d7b31a9d84 · sync: minja (a72057e519) (#11774) · Updated 2025-02-10 09:34:09 +00:00

Branches

9862d59c05 · llama : change starcoder2 rope type · Updated 2024-03-01 13:10:31 +00:00    vbatts

2385 behind · 8 ahead

f0cbb6ddf6 · iq1_s: turn off SIMD implementation for QK_K = 64 (it does not work) · Updated 2024-02-28 06:28:10 +00:00    vbatts

2400 behind · 6 ahead

14d757066b · llama : add llama_kv_cache_compress (EXPERIMENTAL) · Updated 2024-02-27 14:24:40 +00:00    vbatts

2401 behind · 1 ahead

608f449880 · swift : fix build · Updated 2024-02-23 17:02:09 +00:00    vbatts

2432 behind · 4 ahead

56c047156a · py : minor fixes · Updated 2024-02-22 17:22:56 +00:00    vbatts

2441 behind · 1 ahead

5271c75666 · llama : fix K-shift with quantized K (wip) · Updated 2024-02-21 23:28:42 +00:00    vbatts

2449 behind · 1 ahead

f249c997a8 · llama : adapt to F16 KQ_pos · Updated 2024-02-19 11:31:02 +00:00    vbatts

2487 behind · 62 ahead

412735ec70 · Merge branch 'master' into gg/metal-batched · Updated 2024-02-19 09:25:24 +00:00    vbatts

2487 behind · 6 ahead

47c662b0de · fix some spaces added by IDE in math op · Updated 2024-02-18 20:40:35 +00:00    vbatts

2497 behind · 4 ahead

974e3cadff · ggml : try another fix · Updated 2024-02-17 16:14:35 +00:00    vbatts

2516 behind · 2 ahead

e856bfed3b · hf : add support for --repo and --file · Updated 2024-02-15 13:05:15 +00:00    vbatts

2530 behind · 3 ahead

ccd757a174 · convert : fix mistakes from refactoring · Updated 2024-02-13 17:01:30 +00:00    vbatts

2538 behind · 4 ahead

5c977221d2 · iq1_s: slightly faster dot product · Updated 2024-02-13 13:18:27 +00:00    vbatts

2544 behind · 15 ahead

4246b71ad7 · Fix compiler warnings (shadow variable) · Updated 2024-02-13 06:44:56 +00:00    vbatts

2547 behind · 1 ahead

7286b83d3f · BERT WIP · Updated 2024-02-06 22:10:11 +00:00    vbatts

2600 behind · 1 ahead

adcf16fd68 · py : fix empty bytes arg · Updated 2024-02-05 17:53:07 +00:00    vbatts

2610 behind · 2 ahead

91c453fb11 · One cannot possibly be defining static_assert in a C++ compilation · Updated 2024-02-05 11:22:14 +00:00    vbatts

2615 behind · 2 ahead

49a483e0f2 · wip · Updated 2024-02-04 10:34:36 +00:00    vbatts

2641 behind · 60 ahead

a647257b47 · cuda : express strides with helper constants · Updated 2024-02-04 09:45:26 +00:00    vbatts

2641 behind · 60 ahead

b957b8f5f6 · cuda : add flash_attn kernel (wip) · Updated 2024-02-01 17:49:57 +00:00    vbatts

2645 behind · 39 ahead