Default branch

d7b31a9d84 · sync: minja (a72057e519) (#11774) · Updated 2025-02-10 09:34:09 +00:00

Branches

956bb14595 · examples : remove --instruct remnants · Updated 2024-06-10 05:37:47 +00:00 · vbatts · 1562 behind, 1 ahead

d857e5192e · quantize : check imatrix for nan/inf values · Updated 2024-06-06 20:44:24 +00:00 · vbatts · 1580 behind, 2 ahead

731e7528be · server : fix --threads-http arg · Updated 2024-06-06 13:37:12 +00:00 · vbatts · 1581 behind, 1 ahead

f7d4b7c343 · build only main and server in their docker images · Updated 2024-06-05 22:13:01 +00:00 · vbatts · 1588 behind, 2 ahead

3d2e79da7f · add openmp lib to dockerfiles · Updated 2024-06-05 22:05:25 +00:00 · vbatts · 1588 behind, 1 ahead

0085f94936 · server : add /v1/completion endpoint · Updated 2024-06-04 12:58:14 +00:00 · vbatts · 1598 behind, 1 ahead

5f8720fb7b · add rpc-server to Makefile · Updated 2024-05-31 15:22:05 +00:00 · vbatts · 1633 behind, 3 ahead

956af1552a · server : update js · Updated 2024-05-31 12:47:19 +00:00 · vbatts · 1624 behind, 1 ahead

77c16ee0d4 · tests : disable json test due to lack of python on the CI node · Updated 2024-05-31 11:16:54 +00:00 · vbatts · 1637 behind, 3 ahead

d32a8f6142 · backup · Updated 2024-05-31 08:51:56 +00:00 · vbatts · 1634 behind, 2 ahead

8a8f8b953f · llama : print a log of the total cache size · Updated 2024-05-29 18:45:43 +00:00 · vbatts · 1643 behind, 4 ahead

1ca802a3e0 · parallelize fattn compilation test · Updated 2024-05-27 23:19:36 +00:00 · vbatts · 1669 behind, 6 ahead

ddc59e8e0a · wipwipwiwpip · Updated 2024-05-27 09:04:09 +00:00 · vbatts · 1691 behind, 17 ahead

4b1770109c · Fix q_xxs using mul_mat_q · Updated 2024-05-27 08:46:37 +00:00 · vbatts · 1674 behind, 1 ahead

1c6cde92bb · metal : disable FA kernel for HS=256 · Updated 2024-05-27 06:57:20 +00:00 · vbatts · 1676 behind, 1 ahead

11f78c6a2d · convert-hf : adapt ArcticModel to use yield too · Updated 2024-05-25 16:52:53 +00:00 · vbatts · 1683 behind, 4 ahead

dd14d818e0 · Update main-intel.Dockerfile base image to 2024.1.0 · Updated 2024-05-24 02:47:58 +00:00 · vbatts · 1694 behind, 1 ahead

c5fe1d6cdc · gguf-py : remove unused import · Updated 2024-05-23 04:09:49 +00:00 · vbatts · 1709 behind, 2 ahead

518b75260b · cuda uma test · Updated 2024-05-23 01:13:48 +00:00 · vbatts · 1709 behind, 1 ahead

e9095e6098 · async direct io per tensor test · Updated 2024-05-21 23:08:52 +00:00 · vbatts · 1728 behind, 3 ahead