2791 commits

Author SHA1 Message Date
Olivier Chafik
0532680f40 agent: nits 2024-04-27 23:15:45 +01:00
Olivier Chafik
6880f1d4c0 agent: support basic openapi tools (incl. from fastify sandbox) 2024-04-27 23:14:11 +01:00
Olivier Chafik
85820f4401 agent: fix sandbox dockerfile 2024-04-27 23:14:11 +01:00
ochafik
b447a743fb agent: revert to json schemas (ts not ready for refs) 2024-04-27 23:14:11 +01:00
ochafik
701a66d80f agent: fix response_format 2024-04-27 23:14:11 +01:00
ochafik
6e52a9ce48 Update test_chat_handlers.md 2024-04-27 23:14:11 +01:00
ochafik
22fe86d8b8 openai tools: TS signatures work well too at a fraction of the eval cost 2024-04-27 23:14:11 +01:00
ochafik
19811a4011 openai: tests didn't catch output format 2024-04-27 23:14:11 +01:00
ochafik
09de4eb9ed openai: actually use thoughtful examples in tests 2024-04-27 23:14:11 +01:00
ochafik
da2067a0d6 openai: only special-format assistant in thoughtful mode 2024-04-27 23:14:11 +01:00
ochafik
d9f30f86c8 Update test_chat_handlers.md 2024-04-27 23:14:11 +01:00
ochafik
6935503b53 openai: refactor chat handler vs. template 2024-04-27 23:14:11 +01:00
ochafik
3c3eff52aa openai: quiet + update prompt output 2024-04-27 23:14:11 +01:00
ochafik
ad2f4c119a Update test_chat_handlers.py 2024-04-27 23:14:11 +01:00
ochafik
d8a53eadf2 openai: test features of templates at runtime, to make sure no bits of intel are lost 2024-04-27 23:14:11 +01:00
ochafik
61f35e07a5 agent: prepare to test various templates 2024-04-27 23:14:11 +01:00
ochafik
22b980ffc3 agent: update readme 2024-04-27 23:14:11 +01:00
ochafik
dd11bb6937 agent: format still broken 2024-04-27 23:14:11 +01:00
ochafik
ff6563a7bb Delete test.sh 2024-04-27 23:14:11 +01:00
ochafik
3da30ed89e agent: fix functionary tool_calls templating 2024-04-27 23:14:11 +01:00
ochafik
eb9a5524eb agent: nits 2024-04-27 23:14:11 +01:00
ochafik
d1d86027c4 agent: disable parallel by default 2024-04-27 23:14:11 +01:00
ochafik
b4e292ec01 Create requirements.txt 2024-04-27 23:14:11 +01:00
ochafik
e0c8af4ba0 agent: --style 2024-04-27 23:14:11 +01:00
ochafik
9ab493f67e Update prompting.py 2024-04-27 23:14:11 +01:00
ochafik
80c793047b openai: fix message merging for mixtral (parallel calls) 2024-04-27 23:14:11 +01:00
ochafik
ea34bd3e5c agent/openai:nits 2024-04-27 23:14:11 +01:00
ochafik
ce2fb0155f agent: add --allow_parallel_calls 2024-04-27 23:14:11 +01:00
ochafik
c340e8cd3b Update example_weather_tools.py 2024-04-27 23:14:11 +01:00
ochafik
b63f91ade4 Update agent.py 2024-04-27 23:14:11 +01:00
ochafik
e874565a13 agent: split code from openai example 2024-04-27 23:14:11 +01:00
ochafik
253b68d9a7 server.py: crude reactor 2024-04-27 23:14:11 +01:00
ochafik
59b411406f server.py: refactor chat handlers 2024-04-27 23:14:11 +01:00
ochafik
5f3de16116 server.py: pass all request options, comments in ts sigs, render tool calls 2024-04-27 23:14:11 +01:00
ochafik
63a384deaf server.py: raise n_predict 2024-04-27 23:14:11 +01:00
ochafik
a4062935a5 server.py: reenable grammar, accommodate mistral's escaped underscores 2024-04-27 23:14:11 +01:00
ochafik
aa9605c751 server.py: kinda api-compliant output, disabled grammar 2024-04-27 23:14:11 +01:00
ochafik
8afd4de17b server.py: make tools work w/ mixtral-8x7b-instruct 2024-04-27 23:14:11 +01:00
ochafik
d5d9993679 server.py: default tools work! 2024-04-27 23:14:11 +01:00
ochafik
ffc74360e2 agents: scripts to run scripts as sandboxed fastapi servers 2024-04-27 23:14:11 +01:00
ochafik
63d13245e1 server.py: hacky code 2024-04-27 23:14:11 +01:00
ochafik
0d1d46ef1d grammars: add troubleshooting section to readme 2024-04-27 23:14:11 +01:00
agray3
928e0b7013
Reset schedule earlier to allow overlap with ggml graph computation on device (#6933)
* Reset schedule earlier to allow overlap with graph computation on device
2024-04-26 20:08:30 +02:00
Pierrick Hymbert
0c4d489e29
quantize: add imatrix and dataset metadata in GGUF (#6658)
* imatrix: save the dataset file used in the output file

* llama: support kv overrides type string string

* common: factorize KV Overrides parsing between common and server

* quantize: add imatrix n entries and dataset KV metadata
quantize: factorize KV Overrides parsing between common
#6656

* llama: remove kv override str_value initialization as it does not compile on some toolchains

* quantize: add imatrix m_last_call as `quantize.imatrix.chunks_count`

* quantize: add imatrix filename in KV

* llama: add llama_model_kv_override_free

* common: add llama_model_kv_override_free
common: free kv override if used after model loading

* llama: finally move the string KV override value to the stack

* llama : minor

* no need to add a NUL to the std::vector; std::string can be initialized from a pair of iterators.

Co-authored-by: slaren <slarengh@gmail.com>

* kv override: ensure string termination

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: slaren <slarengh@gmail.com>
2024-04-26 20:06:33 +02:00
slaren
017e6999b5
add basic tensor data validation function (#6884)
* add basic tensor data validation function

* add --check-tensors command line argument

tensor validation is disabled by default and can be enabled by adding
`--check-tensors` to the command line arguments.

quantize always validates tensors.
2024-04-26 18:39:58 +02:00
slaren
e2764cd7ca
gguf : fix mismatch between alloc and free functions (#6929) 2024-04-26 18:07:42 +03:00
Justine Tunney
4b1c3c98b4
llamafile : use 64-bit integers in sgemm (#6928) 2024-04-26 17:05:33 +03:00
Pierrick Hymbert
bbe3c6e761
ci: server: fix python installation (#6925) 2024-04-26 12:27:25 +02:00
Pierrick Hymbert
7f5ff558ee
server: stop generation at n_ctx_train if n_predict is not set (#6638)
* server: cap n_predict if not set to n_ctx_train

* server: fix infinite loop

* server: infinite loop, move in process_token
server: infinite loop: set stop limit to true

* minor: spaces

* minor: spaces

* server: include prompt tokens in the EOS limit
2024-04-26 12:15:30 +02:00
Pierrick Hymbert
9e4e077ec5
ci: server: fix python installation (#6922) 2024-04-26 11:11:51 +02:00