2832 commits

Author SHA1 Message Date
ochafik
3c3eff52aa openai: quiet + update prompt output 2024-04-27 23:14:11 +01:00
ochafik
ad2f4c119a Update test_chat_handlers.py 2024-04-27 23:14:11 +01:00
ochafik
d8a53eadf2 openai: test features of templates at runtime, to make sure no bits of intel are lost 2024-04-27 23:14:11 +01:00
ochafik
61f35e07a5 agent: prepare to test various templates 2024-04-27 23:14:11 +01:00
ochafik
22b980ffc3 agent: update readme 2024-04-27 23:14:11 +01:00
ochafik
dd11bb6937 agent: format still broken 2024-04-27 23:14:11 +01:00
ochafik
ff6563a7bb Delete test.sh 2024-04-27 23:14:11 +01:00
ochafik
3da30ed89e agent: fix functionary tool_calls templating 2024-04-27 23:14:11 +01:00
ochafik
eb9a5524eb agent: nits 2024-04-27 23:14:11 +01:00
ochafik
d1d86027c4 agent: disable parallel by default 2024-04-27 23:14:11 +01:00
ochafik
b4e292ec01 Create requirements.txt 2024-04-27 23:14:11 +01:00
ochafik
e0c8af4ba0 agent: --style 2024-04-27 23:14:11 +01:00
ochafik
9ab493f67e Update prompting.py 2024-04-27 23:14:11 +01:00
ochafik
80c793047b openai: fix message merging for mixtral (parallel calls) 2024-04-27 23:14:11 +01:00
ochafik
ea34bd3e5c agent/openai:nits 2024-04-27 23:14:11 +01:00
ochafik
ce2fb0155f agent: add --allow_parallel_calls 2024-04-27 23:14:11 +01:00
ochafik
c340e8cd3b Update example_weather_tools.py 2024-04-27 23:14:11 +01:00
ochafik
b63f91ade4 Update agent.py 2024-04-27 23:14:11 +01:00
ochafik
e874565a13 agent: split code from openai example 2024-04-27 23:14:11 +01:00
ochafik
253b68d9a7 server.py: crude reactor 2024-04-27 23:14:11 +01:00
ochafik
59b411406f server.py: refactor chat handlers 2024-04-27 23:14:11 +01:00
ochafik
5f3de16116 server.py: pass all request options, comments in ts sigs, render tool calls 2024-04-27 23:14:11 +01:00
ochafik
63a384deaf server.py: raise n_predict 2024-04-27 23:14:11 +01:00
ochafik
a4062935a5 server.py: reenable grammar, accommodate mistral's escaped underscores 2024-04-27 23:14:11 +01:00
ochafik
aa9605c751 server.py: kinda api-compliant output, disabled grammar 2024-04-27 23:14:11 +01:00
ochafik
8afd4de17b server.py: make tools work w/ mixtral-8x7b-instruct 2024-04-27 23:14:11 +01:00
ochafik
d5d9993679 server.py: default tools work! 2024-04-27 23:14:11 +01:00
ochafik
ffc74360e2 agents: scripts to run scripts as sandboxed fastapi servers 2024-04-27 23:14:11 +01:00
ochafik
63d13245e1 server.py: hacky code 2024-04-27 23:14:11 +01:00
ochafik
0d1d46ef1d grammars: add troubleshooting section to readme 2024-04-27 23:14:11 +01:00
ochafik
0d47c43a98 gguf: add GGUFReader.read_field(field) method + read template example 2024-04-27 23:11:34 +01:00
mgroeber9110
4dba7e8114 Replace "alternative" boolean operator in conditional compilation directive (#6949) 2024-04-27 21:02:06 +02:00
Pierrick Hymbert
b7368332e2
ci: server: tests python env on github container ubuntu latest / fix n_predict (#6935)
* ci: server: fix python env

* ci: server: fix server tests after #6638

* ci: server: fix windows is not building PR branch
2024-04-27 17:50:48 +02:00
agray3
928e0b7013
Reset schedule earlier to allow overlap with ggml graph computation on device (#6933)
* Reset schedule earlier to allow overlap with graph computation on device
2024-04-26 20:08:30 +02:00
Pierrick Hymbert
0c4d489e29
quantize: add imatrix and dataset metadata in GGUF (#6658)
* imatrix: save the dataset file used in the output file

* llama: support kv overrides type string string

* common: factorize KV Overrides parsing between common and server

* quantize: add imatrix n entries and dataset KV metadata
quantize: factorize KV Overrides parsing between common
#6656

* llama: remove kv override str_value initialization as it does not compile on some toolchain

* quantize: add imatrix m_last_call as `quantize.imatrix.chunks_count`

* quantize: add imatrix filename in KV

* llama: add llama_model_kv_override_free

* common: add llama_model_kv_override_free
common: free kv override if used after model loading

* llama: finally move the string KV override value to the stack

* llama : minor

* no need to add a NUL to the std::vector, std::string can be initialized from a pair of iterators.

Co-authored-by: slaren <slarengh@gmail.com>

* kv override: ensure string termination

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: slaren <slarengh@gmail.com>
2024-04-26 20:06:33 +02:00
slaren
017e6999b5
add basic tensor data validation function (#6884)
* add basic tensor data validation function

* add --check-tensors command line argument

tensor validation is disabled by default and can be enabled by adding
`--check-tensors` to the command line arguments.

quantize always validates tensors.
2024-04-26 18:39:58 +02:00
slaren
e2764cd7ca gguf : fix mismatch between alloc and free functions (#6929) 2024-04-26 18:07:42 +03:00
Justine Tunney
4b1c3c98b4 llamafile : use 64-bit integers in sgemm (#6928) 2024-04-26 17:05:33 +03:00
Pierrick Hymbert
bbe3c6e761 ci: server: fix python installation (#6925) 2024-04-26 12:27:25 +02:00
Pierrick Hymbert
7f5ff558ee
server: stop generation at n_ctx_train if n_predict is not set (#6638)
* server: cap n_predict if not set to n_ctx_train

* server: fix infinite loop

* server: infinite loop, move in process_token
server: infinite loop: set stop limit to true

* minor: spaces

* minor: spaces

* server: include prompt tokens in the EOS limit
2024-04-26 12:15:30 +02:00
Pierrick Hymbert
9e4e077ec5 ci: server: fix python installation (#6922) 2024-04-26 11:11:51 +02:00
Georgi Gerganov
83b72cb086
Merge pull request from GHSA-p5mv-gjc5-mwqv
* always use calloc

clamp n_kv on failure to read a kv

* ggml : alternative ctx->header.n_kv update

---------

Co-authored-by: slaren <slarengh@gmail.com>
2024-04-26 10:41:53 +03:00
Pierrick Hymbert
d4a9afc100 ci: server: fix python installation (#6918) 2024-04-26 09:27:49 +02:00
Pierrick Hymbert
7d641c26ac ci: fix concurrency for pull_request_target (#6917) 2024-04-26 09:26:59 +02:00
Pierrick Hymbert
5790c8dac1 bench: server add stop word for PHI-2 (#6916) 2024-04-26 09:26:16 +02:00
vik
46e12c4692
llava : add support for moondream vision language model (#6899)
* add support for moondream vision language model

This required making the following changes to the CLIP model:

1. Support for patch embedding bias.
2. Make class embedding and pre-layernorm optional.
3. Add support for post-layernorm.

* Update examples/llava/clip.cpp

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-04-25 22:38:31 +03:00
Georgi Gerganov
dba497e0c1 cmake : restore LLAMA_LLAMAFILE_DEFAULT 2024-04-25 21:37:27 +03:00
Georgi Gerganov
fa0b4ad252 cmake : remove obsolete ANDROID check 2024-04-25 18:59:51 +03:00
slaren
d6e1d44f16 llama : synchronize before get/set session data (#6911) 2024-04-25 17:59:03 +02:00
Georgi Gerganov
853d06ffe2 ci : tmp disable slow tests 2024-04-25 17:06:27 +03:00