Commit graph

  • da2067a0d6 openai: only special-format assistant in thoughtful mode ochafik 2024-03-30 01:55:08 +00:00
  • d9f30f86c8 Update test_chat_handlers.md ochafik 2024-03-30 01:50:44 +00:00
  • 6935503b53 openai: refactor chat handler vs. template ochafik 2024-03-30 01:50:36 +00:00
  • 3c3eff52aa openai: quiet + update prompt output ochafik 2024-03-30 01:15:46 +00:00
  • ad2f4c119a Update test_chat_handlers.py ochafik 2024-03-30 01:10:14 +00:00
  • d8a53eadf2 openai: test features of templates at runtime, to make sure no bits of intel are lost ochafik 2024-03-30 01:00:07 +00:00
  • 61f35e07a5 agent: prepare to test various templates ochafik 2024-03-29 23:04:23 +00:00
  • 22b980ffc3 agent: update readme ochafik 2024-03-29 20:16:55 +00:00
  • dd11bb6937 agent: format still broken ochafik 2024-03-29 19:41:11 +00:00
  • ff6563a7bb Delete test.sh ochafik 2024-03-29 19:23:09 +00:00
  • 3da30ed89e agent: fix functionary tool_calls templating ochafik 2024-03-29 19:22:59 +00:00
  • eb9a5524eb agent: nits ochafik 2024-03-29 19:22:46 +00:00
  • d1d86027c4 agent: disable parallel by default ochafik 2024-03-29 19:22:15 +00:00
  • b4e292ec01 Create requirements.txt ochafik 2024-03-29 18:19:28 +00:00
  • e0c8af4ba0 agent: --style ochafik 2024-03-29 18:09:31 +00:00
  • 9ab493f67e Update prompting.py ochafik 2024-03-29 17:11:03 +00:00
  • 80c793047b openai: fix message merging for mixtral (parallel calls) ochafik 2024-03-29 17:01:20 +00:00
  • ea34bd3e5c agent/openai:nits ochafik 2024-03-29 17:00:53 +00:00
  • ce2fb0155f agent: add --allow_parallel_calls ochafik 2024-03-29 16:40:23 +00:00
  • c340e8cd3b Update example_weather_tools.py ochafik 2024-03-29 16:24:59 +00:00
  • b63f91ade4 Update agent.py ochafik 2024-03-29 16:19:05 +00:00
  • e874565a13 agent: split code from openai example ochafik 2024-03-29 16:17:59 +00:00
  • 253b68d9a7 server.py: crude reactor ochafik 2024-03-29 03:24:29 +00:00
  • 59b411406f server.py: refactor chat handlers ochafik 2024-03-29 02:47:33 +00:00
  • 5f3de16116 server.py: pass all request options, comments in ts sigs, render tool calls ochafik 2024-03-28 23:57:14 +00:00
  • 63a384deaf server.py: raise n_predict ochafik 2024-03-28 00:42:12 +00:00
  • a4062935a5 server.py: reenable grammar, accommodate mistral's escaped underscores ochafik 2024-03-27 23:08:30 +00:00
  • aa9605c751 server.py: kinda api-compliant output, disabled grammar ochafik 2024-03-27 01:50:26 +00:00
  • 8afd4de17b server.py: make tools work w/ mixtral-8x7b-instruct ochafik 2024-03-27 00:12:14 +00:00
  • d5d9993679 server.py: default tools work! ochafik 2024-03-26 20:58:03 +00:00
  • ffc74360e2 agents: scripts to run scripts as sandboxed fastapi servers ochafik 2024-03-26 01:26:45 +00:00
  • 63d13245e1 server.py: hacky code ochafik 2024-03-25 23:57:25 +00:00
  • 0d1d46ef1d grammars: add troubleshooting section to readme ochafik 2024-04-08 20:10:15 +01:00
  • 0d47c43a98 gguf: add GGUFReader.read_field(field) method + read template example ochafik 2024-04-27 23:11:34 +01:00
  • 2df79ff950 typo ngxson 2024-04-27 21:24:25 +02:00
  • 4dba7e8114 Replace "alternative" boolean operator in conditional compilation directive (#6949) b2751 mgroeber9110 2024-04-27 21:02:06 +02:00
  • e5672d33cb Fixes mann1x 2024-04-27 20:55:45 +02:00
  • 874c3411c2 support splits in convert.py Christian Azinn 2024-04-27 00:08:55 -04:00
  • fa125a10bb Fix typo mann1x 2024-04-27 20:30:03 +02:00
  • 5ae78a1f23 Convert unsupported datatypes to f32 when converting BERT architectures to GGUF Christian Azinn 2024-04-26 18:26:21 -04:00
  • f97fa9b24c curl: nit: no need for multiline regex flag ochafik 2024-04-27 19:10:03 +01:00
  • 4ebb5e6522 Fix incorrect boolean operator in conditional compilation directive mgroeber9110 2024-04-27 19:15:41 +02:00
  • f70e4d6c6d Merge remote-tracking branch 'origin/master' into model-args ochafik 2024-04-27 16:50:56 +01:00
  • b7368332e2 ci: server: tests python env on github container ubuntu latest / fix n_predict (#6935) b2750 Pierrick Hymbert 2024-04-27 17:50:48 +02:00
  • 581c4a0239 unicode : try fix windows Georgi Gerganov 2024-04-27 18:36:00 +03:00
  • abffd1bc5f curl: unique_ptr to manage lifecycle of curl & outfile ochafik 2024-04-27 16:24:48 +01:00
  • 4c4dc25003 curl: reuse regex across headers callback calls ochafik 2024-04-27 15:58:55 +01:00
  • 91eaa414bf unicode : support \p{N}, \p{L} and \p{P} natively Georgi Gerganov 2024-04-27 17:48:38 +03:00
  • 5c4aea1498 curl: rm legacy .etag file support ochafik 2024-04-27 15:25:01 +01:00
  • ce5485aee0 unicode : always use std::wregex Georgi Gerganov 2024-04-27 17:11:34 +03:00
  • 49c1657821 Fixes mann1x 2024-04-27 15:41:00 +02:00
  • 509c28f4a1 no duplicated tensor while reading gguf ngxson 2024-04-27 15:31:52 +02:00
  • 1282a7bc26 One more opacity adjustment JohnnyB 2024-04-27 13:07:59 +01:00
  • b01716a653 Added worker threads sticking to a single core for Linux mann1x 2024-04-27 13:00:33 +02:00
  • d55ae1513c Added one worker thread per core on Windows mann1x 2024-04-27 12:17:05 +02:00
  • 2affd0b221 unicode : set bomb Georgi Gerganov 2024-04-27 11:56:02 +03:00
  • a22645c2a7 unicode : set bomb Georgi Gerganov 2024-04-27 11:48:24 +03:00
  • 4434c9d6c2 minor Georgi Gerganov 2024-04-27 11:33:16 +03:00
  • ad929833cb llama : adapt punctuation regex + add llama 3 regex Georgi Gerganov 2024-04-27 11:06:08 +03:00
  • 96965f67e6 models : add llama v3 vocab file Georgi Gerganov 2024-04-27 11:05:12 +03:00
  • 16c093543a fixed health monitoring, added feature to terminate tasks if server stopped rahsuri 2024-04-26 22:41:16 -04:00
  • 7666c4c059 Implemented basic interface for llamacheck and link to weights, adapting from simple.cpp Andrew Ferrouolo 2024-04-26 22:28:22 -04:00
  • 9710c6dafe added some libraries rahsuri 2024-04-26 22:12:06 -04:00
  • d9bdf450af health checking for request cancellations rahsuri 2024-04-26 22:07:03 -04:00
  • 8cbca9aa7f Add files via upload rahsuri 2024-04-26 22:04:30 -04:00
  • c160818ec0 wip Georgi Gerganov 2024-04-27 00:28:36 +03:00
  • a3764f8f04 ci: server: fix windows is not building PR branch Pierrick HYMBERT 2024-04-26 22:12:46 +02:00
  • ed979270d2 ci: server: fix server tests after #6638 Pierrick HYMBERT 2024-04-26 21:43:29 +02:00
  • d3b1c4e953 ci: server: fix python env Pierrick HYMBERT 2024-04-26 21:31:26 +02:00
  • a774d7084e make : add test-tokenizer-0-llama-v3 Georgi Gerganov 2024-04-26 21:25:36 +03:00
  • 8791e94e3c lint : fix Georgi Gerganov 2024-04-26 21:12:05 +03:00
  • 928e0b7013 Reset schedule earlier to allow overlap with ggml graph computation on device (#6933) b2749 agray3 2024-04-26 19:08:30 +01:00
  • 728562bc12 style fix slaren 2024-04-26 20:08:05 +02:00
  • 0c4d489e29 quantize: add imatrix and dataset metadata in GGUF (#6658) b2748 Pierrick Hymbert 2024-04-26 20:06:33 +02:00
  • 1b9b79dd14 convert : fix pre-tokenizer type writing Georgi Gerganov 2024-04-26 20:55:14 +03:00
  • 34847caa9a moved reset to end of llama_decode_internal Alan Gray 2024-04-26 10:24:02 -07:00
  • 43e12ce8e5 llama : use new pre-tokenizer type Georgi Gerganov 2024-04-26 20:08:28 +03:00
  • 50599208d6 Fix log target Daniel Hiltgen 2024-04-25 20:54:18 -07:00
  • 017e6999b5 add basic tensor data validation function (#6884) b2747 slaren 2024-04-26 18:39:58 +02:00
  • eeb3d5882a curl: support legacy .etag / .lastModified companion files Olivier Chafik 2024-04-26 17:36:56 +01:00
  • 9b4d63ae53 convert : add "tokenizer.ggml.pre" GGUF KV (wip) Georgi Gerganov 2024-04-26 19:21:55 +03:00
  • a2beaffec8 Reset schedule earlier to allow overlap with graph computation on device Alan Gray 2024-04-26 06:16:56 -07:00
  • 8ddd0228ff fix QK_K == 64 slaren 2024-04-26 17:13:53 +02:00
  • 41de83d5ca fix neon reinterpret type slaren 2024-04-26 17:13:30 +02:00
  • e3f6dc7409 Merge branch 'master' into gg/bpe-preprocess Georgi Gerganov 2024-04-26 18:08:40 +03:00
  • e2764cd7ca gguf : fix mismatch between alloc and free functions (#6929) b2746 slaren 2024-04-26 17:07:42 +02:00
  • ea591858b8 validate data asynchronously when possible slaren 2024-04-26 04:10:36 +02:00
  • 047291fb42 spacing and capitalization changes. Fix the register list of GGML_5bit_Unpacked_Unaligned. Julia Longtin 2024-04-26 14:44:08 +00:00
  • 261d3dbad9 Opacity action trigger. JohnnyB 2024-04-26 15:37:36 +01:00
  • 5ce50f6a8b args: fix update to quantize-stats.cpp Olivier Chafik 2024-04-26 15:31:49 +01:00
  • 828661cfe6 gguf : fix mismatch between alloc and free functions slaren 2024-04-26 16:14:36 +02:00
  • 4b1c3c98b4 llamafile : use 64-bit integers in sgemm (#6928) Justine Tunney 2024-04-26 10:05:33 -04:00
  • 0664e9b321 curl: check url of previous download (.json metadata w/ url, etag & lastModified) Olivier Chafik 2024-04-26 14:12:15 +01:00
  • c0ee4d52b7 llamafile : use 64-bit integers in sgemm Justine Tunney 2024-04-26 06:31:07 -07:00
  • e55dfde3b0 args: define DEFAULT_MODEL_PATH + update cli docs Olivier Chafik 2024-04-26 14:33:59 +01:00
  • f7d2c0a5cd Added set thread affinity for Linux mann1x 2024-04-26 15:09:17 +02:00
  • e9891769ff unicode : first try custom implementations Georgi Gerganov 2024-04-26 15:09:07 +03:00
  • e8c206be61 unicode : shot in the dark to fix tests on Windows Georgi Gerganov 2024-04-26 14:57:12 +03:00
  • 4907e41aa7 llama : towards llama3 tokenization support (wip) Georgi Gerganov 2024-04-26 14:55:03 +03:00
  • ed42711b90 gguf-py : reader prints warnings on duplicate keys Georgi Gerganov 2024-04-26 14:32:22 +03:00