Commit graph

2314 commits

Author SHA1 Message Date
Pierrick HYMBERT
d159e29d4b server: test: ci fix openblas build 2024-02-23 11:34:38 +01:00
Pierrick HYMBERT
606738eeef server: test: ci fix clblast 2024-02-23 11:32:25 +01:00
Pierrick HYMBERT
fa51baca9a server: test: ci fix matrix 2024-02-23 11:30:24 +01:00
Pierrick HYMBERT
2edd995f2a server: test: ci fix cublas build 2024-02-23 11:27:19 +01:00
Pierrick HYMBERT
e4fb790077 server: test: ci fix cuda build 2024-02-23 11:19:49 +01:00
Pierrick HYMBERT
fce2e00023 server: tests: ci : fix cuda install 2024-02-23 10:58:59 +01:00
Pierrick HYMBERT
334902b13e server: tests: ci : fix step id duplicated 2024-02-23 10:56:07 +01:00
Pierrick HYMBERT
86896aadd0 server: tests: ci : continue on error 2024-02-23 10:53:46 +01:00
Pierrick HYMBERT
68cd1a4c16 server: tests: ci : matrix cuda 2024-02-23 10:46:17 +01:00
Pierrick HYMBERT
12bb797193 server: tests: ci : add git 2024-02-23 10:41:41 +01:00
Pierrick HYMBERT
29f8833058 server: tests: ci : fix wget missing 2024-02-23 10:39:45 +01:00
Pierrick HYMBERT
0b0f0565dd server: tests: ci : build and run tests for all matrix defines, sanitizer and type 2024-02-23 10:33:21 +01:00
Pierrick HYMBERT
36ddb962d8 server: tests: parallel fix server is started twice, add colors to help to monitor in the CI jobs 2024-02-23 10:09:19 +01:00
Pierrick HYMBERT
530d3ae4c4 server: tests: reducing sleep time during scenario 2024-02-23 02:38:54 +01:00
Pierrick HYMBERT
bedf37c9d1 server: tests: reducing n_ctx and n_predict for // prompts as it is too slow in the CI. 2024-02-23 02:38:37 +01:00
Pierrick HYMBERT
5110de08e3 server: tests: fix coloring console 2024-02-23 02:31:44 +01:00
Pierrick HYMBERT
6bba3be151 server: tests: ci adding psmisc as it is not present by default in ubuntu base killall 2024-02-23 02:31:30 +01:00
Pierrick HYMBERT
6e71126c12 server: tests: ci adding curl as it is not present by default in ubuntu base for the hf.sh script 2024-02-23 02:19:47 +01:00
Pierrick HYMBERT
d0e0050843 server: tests: ci adding python3-pip as it is not present by default in ubuntu base 2024-02-23 02:16:56 +01:00
Pierrick HYMBERT
2bb4732c01 server: tests: ci adding cmake as it is not present by default in ubuntu base 2024-02-23 02:13:30 +01:00
Pierrick HYMBERT
6a215e5359 server: tests: ci adding container to specify server port and allow the server to listen to 2024-02-23 02:11:56 +01:00
Pierrick HYMBERT
2f756f84df server: tests: allow to override the server port before launching tests 2024-02-23 01:59:29 +01:00
Pierrick HYMBERT
70e90558ae server: tests: add log in server start to identify why the server does not listen on the CI 2024-02-23 01:46:08 +01:00
Pierrick HYMBERT
b38b9e60a1 server: tests: minor fix server --alias param passed twice 2024-02-23 01:31:56 +01:00
Pierrick HYMBERT
14b6ede152 server: tests: minor color change 2024-02-23 01:29:39 +01:00
Pierrick HYMBERT
1bd07e56c4 server: tests: assert embeddings are actually computed, make the embeddings endpoint configurable.
Add logs to investigate why the CI server test job is not starting
2024-02-23 01:25:08 +01:00
Pierrick HYMBERT
cba6d4ea17 server: tests: minor fix missing param. 2024-02-23 00:54:44 +01:00
Pierrick HYMBERT
51f527440a server: tests: ci triggered on any changes on server example path 2024-02-23 00:37:42 +01:00
Pierrick HYMBERT
26b66c5496 server: tests: Fix some random behavior where the wait for busy status is missing 2024-02-22 23:38:47 +01:00
Pierrick HYMBERT
aa591ef12d server: tests: add Multi users with total number of tokens to predict exceeds the KV Cache size 2024-02-22 23:37:56 +01:00
Pierrick HYMBERT
f820e10fa7 server: tests: ci ensure the server is stopped before scenario, and do not quit while the server is listening 2024-02-22 23:18:42 +01:00
Pierrick HYMBERT
8b96bdaf08 Merge remote-tracking branch 'origin/master' into test/server-add-ci-test 2024-02-22 22:11:36 +01:00
Pierrick HYMBERT
597c181abb server: tests: ci do not take a model anymore, fix trigger patch 2024-02-22 21:58:28 +01:00
Pierrick HYMBERT
e43406e36d server: tests: switch to asyncio for concurrent tests, match result content with regex 2024-02-22 21:55:40 +01:00
Pierrick HYMBERT
016b221549 server: fix health/slots endpoint slot state access available race condition 2024-02-22 21:55:18 +01:00
Someone
201294ae17
nix: init singularity and docker images (#5056)
Exposes a few attributes demonstrating how to build [singularity](https://docs.sylabs.io/guides/latest/user-guide/)/[apptainer](https://apptainer.org/) and Docker images re-using llama.cpp's Nix expression.

Built locally on `x86_64-linux` with `nix build github:someoneserge/llama.cpp/feat/nix/images#llamaPackages.{docker,docker-min,sif,llama-cpp}` and it's fast and effective.
2024-02-22 11:44:10 -08:00
Georgi Gerganov
5a9e2f60ba
py : minor fixes (#5668) 2024-02-22 20:13:25 +02:00
Xuan Son Nguyen
373ee3fbba
Add Gemma chat template (#5665)
* add gemma chat template

* gemma: only apply system_prompt on non-model message
2024-02-22 19:10:21 +01:00
Someone
4cb4d8b22d
workflows: nix: hardcode cachix ids, build unconditionally (#5663)
GitHub does not expose environment and repository variables to PRs coming from forks, which implies that we've been disabling the Nix CI actions for most PRs.

The `if:` also didn't make much sense, because we can always pull from cachix, and there's no point (albeit no risk either) in pushing cache for the untrusted code.
2024-02-22 08:32:09 -08:00
Georgi Gerganov
3a03541ced
minor : fix trailing whitespace (#5638) 2024-02-22 13:54:03 +02:00
Georgi Gerganov
41676d9920
ci : actually no reason to exclude GPU code from triggers 2024-02-22 13:33:00 +02:00
Georgi Gerganov
a697cd1314
minor : fix missing new line 2024-02-22 13:29:20 +02:00
Georgi Gerganov
56d03d92be
readme : update hot topics 2024-02-22 10:35:54 +02:00
Xuan Son Nguyen
a46f50747b
server : fallback to chatml, add AlphaMonarch chat template (#5628)
* server: fallback to chatml

* add new chat template

* server: add AlphaMonarch to test chat template

* server: only check model template if there is no custom tmpl

* remove TODO
2024-02-22 10:33:24 +02:00
Alexey Parfenov
c5688c6250
server : clarify some params in the docs (#5640) 2024-02-22 10:27:32 +02:00
Dat Quoc Nguyen
4ef245a92a
mpt : add optional bias tensors (#5638)
Update for MPT with optional bias parameters: to work with PhoGPT and SEA-LION models that were pre-trained with 'bias'.
2024-02-22 10:15:13 +02:00
slaren
973053d8b0
llama : fix loading models with shared tok_embd and output (#5651)
ggml-ci
2024-02-22 00:42:09 +01:00
Xuan Son Nguyen
7c8bcc11dc
Add docs for llama_chat_apply_template (#5645)
* add docs for llama_chat_apply_template

* fix typo
2024-02-22 00:31:00 +01:00
Pierrick HYMBERT
534998dbb9 server: tests: ci tests.sh exit code 2024-02-21 23:06:20 +01:00
slaren
7fe4678b02
llama : fix session save/load with quantized KV (#5649) 2024-02-21 22:52:39 +01:00