Pierrick HYMBERT
d159e29d4b
server: test: ci fix openblas build
2024-02-23 11:34:38 +01:00
Pierrick HYMBERT
606738eeef
server: test: ci fix clblast
2024-02-23 11:32:25 +01:00
Pierrick HYMBERT
fa51baca9a
server: test: ci fix matrix
2024-02-23 11:30:24 +01:00
Pierrick HYMBERT
2edd995f2a
server: test: ci fix cublas build
2024-02-23 11:27:19 +01:00
Pierrick HYMBERT
e4fb790077
server: test: ci fix cuda build
2024-02-23 11:19:49 +01:00
Pierrick HYMBERT
fce2e00023
server: tests: ci : fix cuda install
2024-02-23 10:58:59 +01:00
Pierrick HYMBERT
334902b13e
server: tests: ci : fix step id duplicated
2024-02-23 10:56:07 +01:00
Pierrick HYMBERT
86896aadd0
server: tests: ci : continue on error
2024-02-23 10:53:46 +01:00
Pierrick HYMBERT
68cd1a4c16
server: tests: ci : matrix cuda
2024-02-23 10:46:17 +01:00
Pierrick HYMBERT
12bb797193
server: tests: ci : add git
2024-02-23 10:41:41 +01:00
Pierrick HYMBERT
29f8833058
server: tests: ci : fix wget missing
2024-02-23 10:39:45 +01:00
Pierrick HYMBERT
0b0f0565dd
server: tests: ci : build and run tests for all matrix defines, sanitizer and type
2024-02-23 10:33:21 +01:00
Pierrick HYMBERT
36ddb962d8
server: tests: parallel: fix server being started twice, add colors to help monitoring in the CI jobs
2024-02-23 10:09:19 +01:00
Pierrick HYMBERT
530d3ae4c4
server: tests: reducing sleep time during scenario
2024-02-23 02:38:54 +01:00
Pierrick HYMBERT
bedf37c9d1
server: tests: reducing n_ctx and n_predict for parallel prompts as they are too slow in the CI.
2024-02-23 02:38:37 +01:00
Pierrick HYMBERT
5110de08e3
server: tests: fix coloring console
2024-02-23 02:31:44 +01:00
Pierrick HYMBERT
6bba3be151
server: tests: ci adding psmisc (provides killall) as it is not present by default in ubuntu base
2024-02-23 02:31:30 +01:00
Pierrick HYMBERT
6e71126c12
server: tests: ci adding curl as it is not present by default in ubuntu base for the hf.sh script
2024-02-23 02:19:47 +01:00
Pierrick HYMBERT
d0e0050843
server: tests: ci adding python3-pip as it is not present by default in ubuntu base
2024-02-23 02:16:56 +01:00
Pierrick HYMBERT
2bb4732c01
server: tests: ci adding cmake as it is not present by default in ubuntu base
2024-02-23 02:13:30 +01:00
Pierrick HYMBERT
6a215e5359
server: tests: ci adding container to specify server port and allow the server to listen to
2024-02-23 02:11:56 +01:00
Pierrick HYMBERT
2f756f84df
server: tests: allow to override the server port before launching tests
2024-02-23 01:59:29 +01:00
Pierrick HYMBERT
70e90558ae
server: tests: add log in server start to identify why the server does not listen on the CI
2024-02-23 01:46:08 +01:00
Pierrick HYMBERT
b38b9e60a1
server: tests: minor fix server --alias param passed twice
2024-02-23 01:31:56 +01:00
Pierrick HYMBERT
14b6ede152
server: tests: minor color change
2024-02-23 01:29:39 +01:00
Pierrick HYMBERT
1bd07e56c4
server: tests: assert embeddings are actually computed, make the embeddings endpoint configurable.
...
Add logs to investigate why the CI server test job is not starting
2024-02-23 01:25:08 +01:00
Pierrick HYMBERT
cba6d4ea17
server: tests: minor fix missing param.
2024-02-23 00:54:44 +01:00
Pierrick HYMBERT
51f527440a
server: tests: ci triggered on any changes on server example path
2024-02-23 00:37:42 +01:00
Pierrick HYMBERT
26b66c5496
server: tests: Fix some random behavior where the wait for busy status is missing
2024-02-22 23:38:47 +01:00
Pierrick HYMBERT
aa591ef12d
server: tests: add multi-user scenario where the total number of tokens to predict exceeds the KV cache size
2024-02-22 23:37:56 +01:00
Pierrick HYMBERT
f820e10fa7
server: tests: ci ensure the server is stopped before scenario, and do not quit while the server is listening
2024-02-22 23:18:42 +01:00
Pierrick HYMBERT
8b96bdaf08
Merge remote-tracking branch 'origin/master' into test/server-add-ci-test
2024-02-22 22:11:36 +01:00
Pierrick HYMBERT
597c181abb
server: tests: ci do not take a model anymore, fix trigger path
2024-02-22 21:58:28 +01:00
Pierrick HYMBERT
e43406e36d
server: tests: switch to asyncio for concurrent tests, match result content with regex
2024-02-22 21:55:40 +01:00
Pierrick HYMBERT
016b221549
server: fix race condition when accessing available slot state in the health/slots endpoints
2024-02-22 21:55:18 +01:00
Someone
201294ae17
nix: init singularity and docker images (#5056)
...
Exposes a few attributes demonstrating how to build [singularity](https://docs.sylabs.io/guides/latest/user-guide/)/[apptainer](https://apptainer.org/) and Docker images re-using llama.cpp's Nix expression.
Built locally on `x86_64-linux` with `nix build github:someoneserge/llama.cpp/feat/nix/images#llamaPackages.{docker,docker-min,sif,llama-cpp}` and it's fast and effective.
2024-02-22 11:44:10 -08:00
Georgi Gerganov
5a9e2f60ba
py : minor fixes (#5668)
2024-02-22 20:13:25 +02:00
Xuan Son Nguyen
373ee3fbba
Add Gemma chat template (#5665)
...
* add gemma chat template
* gemma: only apply system_prompt on non-model message
2024-02-22 19:10:21 +01:00
Someone
4cb4d8b22d
workflows: nix: hardcode cachix ids, build unconditionally (#5663)
...
GitHub does not expose environment and repository variables to PRs coming from forks, which implies that we've been disabling the Nix CI actions for most PRs.
The `if:` also didn't make much sense, because we can always pull from cachix, and there's no point (albeit no risk either) in pushing cache for the untrusted code.
2024-02-22 08:32:09 -08:00
Georgi Gerganov
3a03541ced
minor : fix trailing whitespace (#5638)
2024-02-22 13:54:03 +02:00
Georgi Gerganov
41676d9920
ci : actually no reason to exclude GPU code from triggers
2024-02-22 13:33:00 +02:00
Georgi Gerganov
a697cd1314
minor : fix missing new line
2024-02-22 13:29:20 +02:00
Georgi Gerganov
56d03d92be
readme : update hot topics
2024-02-22 10:35:54 +02:00
Xuan Son Nguyen
a46f50747b
server : fallback to chatml, add AlphaMonarch chat template (#5628)
...
* server: fallback to chatml
* add new chat template
* server: add AlphaMonarch to test chat template
* server: only check model template if there is no custom tmpl
* remove TODO
2024-02-22 10:33:24 +02:00
Alexey Parfenov
c5688c6250
server : clarify some params in the docs (#5640)
2024-02-22 10:27:32 +02:00
Dat Quoc Nguyen
4ef245a92a
mpt : add optional bias tensors (#5638)
...
Update for MPT with optional bias parameters: to work with PhoGPT and SEA-LION models that were pre-trained with 'bias'.
2024-02-22 10:15:13 +02:00
slaren
973053d8b0
llama : fix loading models with shared tok_embd and output (#5651)
...
ggml-ci
2024-02-22 00:42:09 +01:00
Xuan Son Nguyen
7c8bcc11dc
Add docs for llama_chat_apply_template (#5645)
...
* add docs for llama_chat_apply_template
* fix typo
2024-02-22 00:31:00 +01:00
Pierrick HYMBERT
534998dbb9
server: tests: ci tests.sh exit code
2024-02-21 23:06:20 +01:00
slaren
7fe4678b02
llama : fix session save/load with quantized KV (#5649)
2024-02-21 22:52:39 +01:00