Default branch

d7b31a9d84 · sync: minja (a72057e519) (#11774) · Updated 2025-02-10 09:34:09 +00:00

Branches

Each entry: head commit · message · last updated · pusher · commits behind / ahead of the default branch (branch name and CI status shown where available).

6494509801 · backup · Updated 2024-08-26 08:58:54 +00:00 · vbatts · 1064 behind / 2 ahead
ccb45186d0 · docs : remove references · Updated 2024-08-26 06:52:17 +00:00 · vbatts · 1058 behind / 2 ahead
8062650343 · llama : fix simple splits when the batch contains embeddings · Updated 2024-08-21 19:09:03 +00:00 · vbatts · 1069 behind / 19 ahead
9127800d83 · wip · Updated 2024-08-16 23:51:06 +00:00 · vbatts · 1102 behind / 2 ahead
62d7b6c87f · cuda : re-add q4_0 · Updated 2024-08-14 10:37:03 +00:00 · vbatts · 1098 behind / 3 ahead
93ec58b932 · server : fix typo in comment · Updated 2024-08-14 02:12:26 +00:00 · vbatts · 1100 behind / 4 ahead
faaac59d16 · llama : support NUL bytes in tokens · Updated 2024-08-12 01:00:03 +00:00 · vbatts · 1111 behind / 1 ahead
73bc9350cd · gguf-py : Numpy dequantization for grid-based i-quants · Updated 2024-08-10 03:47:31 +00:00 · vbatts · 1131 behind / 2 ahead
9329953a61 · llama : avoid double tensor copy when saving session to buffer · Updated 2024-08-07 20:03:34 +00:00 · vbatts · 1139 behind / 2 ahead
7764ab911d · update guide · Updated 2024-08-07 14:01:02 +00:00 · vbatts · 1140 behind / 1 ahead
cad8abb49b · add tool to allow plotting tensor allocation maps within buffers · Updated 2024-08-06 20:09:51 +00:00 · vbatts · 1148 behind / 1 ahead
6e299132e7 · clip : style changes · Updated 2024-08-06 08:44:29 +00:00 · vbatts · 1472 behind / 56 ahead
16dab13bde · correct cmd name · Updated 2024-08-05 16:15:33 +00:00 · vbatts · 1157 behind / 1 ahead
bddcc5f985 · llama : better replace_all · Updated 2024-08-04 10:42:08 +00:00 · vbatts · 1173 behind / 1 ahead
229c35cb59 · gguf-py : remove LlamaFileTypeMap · Updated 2024-08-04 01:22:37 +00:00 · vbatts · 1176 behind / 5 ahead
eab4a88210 · Using dp4a ptx intrinsics for an improved Mul8MAT perf [By Alcpz] · Updated 2024-07-29 15:52:29 +00:00 · vbatts · 1194 behind / 1 ahead
9cddd9aeec · llama : cast seq_id in comparison with unsigned n_seq_max · Updated 2024-07-27 19:50:23 +00:00 · vbatts · 1232 behind / 7 ahead
9aeb0e1f75 · sycl add conv support · Updated 2024-07-25 12:15:02 +00:00 · vbatts · 1221 behind / 1 ahead
vbatts-finetune · 4b0eff3df5 · docs : Quantum -> Quantized (#8666) · Updated 2024-07-25 08:13:27 +00:00 · vbatts · 1224 behind / 0 ahead · Included · Some checks failed: flake8 Lint / Lint (push) has been cancelled
5934580905 · ggml : add and use ggml_cpu_has_llamafile() · Updated 2024-07-24 08:31:41 +00:00 · vbatts · 1232 behind / 1 ahead