Commit graph

  • c2926e4bd9 Update README.md ochafik 2024-10-24 06:40:16 +01:00
  • d338bfb87f agent: ditch aiohttp & define REQUESTS_CA_BUNDLE to fix http proxying / trust the self-signed cert from python ochafik 2024-10-24 06:35:37 +01:00
  • 0f5d63943f agent: display http errors nicely ochafik 2024-10-24 05:40:58 +01:00
  • f5320af02a tool-call: return tool_call.id (required by Nemo) ochafik 2024-10-24 05:40:15 +01:00
  • 267e630c14 agent: isolate tools container + log its outgoing HTTP & HTTPS traffic w/ docker compose + self-signed squid proxy ochafik 2024-10-24 05:38:54 +01:00
  • 13038930af DRY: Fixed redundant code wwoodsTM 2024-10-23 18:40:24 -06:00
  • 0ec847315b DRY: Trying to fix coauthors, removed unneeded line wwoodsTM 2024-10-23 17:32:50 -06:00
  • 9697338d2d sampling : add DRY sampler (post-refactor) #9702 wwoodsTM 2024-10-23 16:21:04 -06:00
  • b550011be3 fix infinite generation loop Xuan Son Nguyen 2024-10-24 00:03:00 +02:00
  • 4004bb7015 revise test code Johannes Gäßler 2024-10-23 23:53:58 +02:00
  • 60d4194bfe fix incorrect if branch Xuan Son Nguyen 2024-10-23 23:48:08 +02:00
  • 3abc33962e Merge branch 'master' into xsn/refactor_server_slot_input Xuan Son Nguyen 2024-10-23 23:43:42 +02:00
  • 5c749bea00 move prompt_tokens.empty() check Xuan Son Nguyen 2024-10-23 23:39:48 +02:00
  • 125835b253 server : refactor slot input data, move tokenizer to HTTP thread Xuan Son Nguyen 2024-10-23 23:09:54 +02:00
  • 4a29bca867 update vulkan target name Zack Zhiyuan Li 2024-10-23 20:54:39 +00:00
  • 414f6f1b30 Merge branch 'tool-call' of github.com:ochafik/llama.cpp into tool-call ochafik 2024-10-23 21:22:08 +01:00
  • 4394e1cd5e Update tool-call.cpp ochafik 2024-10-23 21:21:39 +01:00
  • 0a1c750c80 server : samplers accept the prompt correctly (#10019) b3970 wwoodsTM 2024-10-23 13:27:51 -06:00
  • e41cfc7ab8 llama: Refactor string_split to use template specialization, fixes parsing strings with spaces Michael Podvitskiy 2024-10-23 21:19:47 +02:00
  • b87dc5a44f CUDA: fix MMQ for non-contiguous src0, add tests Johannes Gäßler 2024-10-23 17:21:41 +02:00
  • 25ad631f6d Server - Sampling bug fix wwoodsTM 2024-10-23 10:53:37 -06:00
  • 7d5e0c0063 fix join license list momonga 2024-10-24 00:08:05 +09:00
  • edf2583fbc Merge branch 'ggerganov:master' into master momonga 2024-10-23 23:30:23 +09:00
  • 190a37d797 sync : ggml b3969 Georgi Gerganov 2024-10-23 17:23:55 +03:00
  • 2d3aba9ee8 llama.vim : bump generation time limit to 3s [no ci] Georgi Gerganov 2024-10-23 17:16:56 +03:00
  • 80273a306d CUDA: fix 1D im2col, add tests (ggml/993) b3967 Johannes Gäßler 2024-10-18 09:24:44 +02:00
  • c19af0acb1 ggml : remove redundant set of contexts used field (ggml/978) Daniel Bevenius 2024-10-16 20:10:01 +02:00
  • 16c5486d1a CUDA: fix 1D im2col, add tests (ggml/993) Johannes Gäßler 2024-10-18 09:24:44 +02:00
  • fc73924db2 ggml : remove redundant set of contexts used field (ggml/978) Daniel Bevenius 2024-10-16 20:10:01 +02:00
  • 20011f15fc llama : switch KQ multiplication to use F32 precision by default Georgi Gerganov 2024-10-23 14:32:27 +03:00
  • a6679d94bc Add granite template to test-chat-template.cpp arch-btw 2024-10-23 04:16:42 -07:00
  • 3614aac2a3 Merge branch 'ggerganov:master' into master arch-btw 2024-10-23 04:13:46 -07:00
  • e74773ec43 Add granite template to llama.cpp arch-btw 2024-10-23 04:12:53 -07:00
  • ac113a0fee llama.vim : add classic vim support (#9995) b3965 Michael Coppola 2024-10-23 07:09:26 -04:00
  • 4c9388fb96 metal : add POOL2D and fix IM2COL (#9943) b3964 Jun Hee Yoo 2024-10-23 19:33:45 +09:00
  • 5f4aef10ba Merge remote-tracking branch 'origin/master' into tool-call Olivier Chafik 2024-10-23 11:28:28 +01:00
  • 3c2b87df4b fix more formatting and enhance readability Junhee Yoo 2024-10-23 17:46:08 +09:00
  • 746e79e9a5 apply review Junhee Yoo 2024-10-23 17:00:17 +09:00
  • 63978cb6dc server: handle n_predict==2 error zhenweijin 2024-10-18 18:06:19 +08:00
  • bb9949b3f6 apply review: change kernel name of pool_2d Junhee Yoo 2024-10-23 14:59:30 +09:00
  • bd86c4c4df apply more optimization Junhee Yoo 2024-10-23 11:19:23 +09:00
  • 0084847991 apply suggestions Junhee Yoo 2024-10-23 11:18:39 +09:00
  • 2b49440011 tool-call: fix previous commit's parallel arg ochafik 2024-10-23 02:35:21 +01:00
  • 3e12b9b38e tool-calls: basic Nemo support, default parallel to true if template mentions tool_call_id ochafik 2024-10-23 02:30:31 +01:00
  • 873279b159 flake.lock: Update github-actions[bot] 2024-10-20 00:22:59 +00:00
  • bd4d1221d9 Update ggml/src/ggml-sycl.cpp Neo Zhang Jianyu 2024-10-23 08:38:50 +08:00
  • 68c838e164 Update ggml/src/ggml-sycl.cpp Neo Zhang Jianyu 2024-10-23 08:38:42 +08:00
  • fc80ad20ce tool-call: Log tool call style name, ensure returned content not null ochafik 2024-10-22 23:41:47 +01:00
  • a4f12a4594 minja: fix string subscripts, add string pipe to support Mistral-Nemo template ochafik 2024-10-22 23:39:46 +01:00
  • a279f17815 feat(convert_hf_to_gguf): support q4_0 and q4_1 quantifications pancake 2024-10-22 18:51:03 +02:00
  • 7d6ef40c19 minor Michael Coppola 2024-10-22 12:23:35 -04:00
  • 85cea66dbb renamed *_ghost_text to ghost_text_*, moved nvim/vim detection to llama#init() Michael Coppola 2024-10-22 12:10:35 -04:00
  • f946cbc42a renamed *_hlgroup to hlgroup_* Michael Coppola 2024-10-22 12:02:24 -04:00
  • 5292d45f99 vim ghost text rendering now uses pos_x and pos_y parameters Michael Coppola 2024-10-22 11:59:16 -04:00
  • cc961965e1 minor Michael Coppola 2024-10-22 11:48:01 -04:00
  • bf9c4ccdc5 unified fim_on_exit Michael Coppola 2024-10-22 11:45:24 -04:00
  • b39e65fa68 minor Michael Coppola 2024-10-22 11:32:48 -04:00
  • 39c3cd41d5 removed unused code Michael Coppola 2024-10-22 11:28:09 -04:00
  • c8c07d658a llama : fix empty batch causing llama_batch_allocr to crash (#9966) b3962 Xuan Son Nguyen 2024-10-22 16:59:02 +02:00
  • b81721b523 fixed ghost text indenting when expandtab is on Michael Coppola 2024-10-22 10:35:17 -04:00
  • 66a4cd1b60 Merge branch 'llama.vim_classic_vim' of github.com:m18coppola/llama.cpp into llama.vim_classic_vim Michael Coppola 2024-10-22 09:45:43 -04:00
  • e30b4c9d6a fixed job_start creating new scratch buffers Michael Coppola 2024-10-22 09:45:21 -04:00
  • d7d18bff22 fixed job_start creating new scratch buffers Michael Coppola 2024-10-22 09:39:11 -04:00
  • 351aecbe3f Update llama-sampling.cpp Olivier Chafik 2024-10-22 14:37:43 +01:00
  • db4bf93812 Merge remote-tracking branch 'origin/master' into tool-call Olivier Chafik 2024-10-22 14:37:30 +01:00
  • 19d900a756 llama : rename batch to ubatch (#9950) b3961 Daniel Bevenius 2024-10-22 15:31:06 +02:00
  • f732003622 Apply suggestions from code review Xuan Son Nguyen 2024-10-22 15:23:42 +02:00
  • 11d47057a5 Rwkv chat template fix (#10001) b3960 Molly Sophia 2024-10-22 21:22:26 +08:00
  • 7dde288320 Add files via upload Caleb Princewill Nwokocha 2024-10-22 07:50:24 -05:00
  • 1135b0f816 Update src/llama.cpp Molly Sophia 2024-10-22 20:26:45 +08:00
  • facec466e8 converter: Add comment about the hack for rwkv models Molly Sophia 2024-10-22 19:27:21 +08:00
  • 543b1027aa llama: remove useless template matching for rwkv-world Molly Sophia 2024-10-22 19:26:41 +08:00
  • 11b1564efb add GGML_ASSERT Xuan Son Nguyen 2024-10-22 13:14:38 +02:00
  • 540c3016d8 fix build Xuan Son Nguyen 2024-10-22 13:10:40 +02:00
  • bf88cf87d9 Add files via upload Caleb Princewill Nwokocha 2024-10-22 06:09:38 -05:00
  • c421ac072d lora : warn user if new token is added in the adapter (#9948) b3959 Xuan Son Nguyen 2024-10-22 13:08:41 +02:00
  • 6ab116ac5a move batch_allocr inside decode/encode_internal Xuan Son Nguyen 2024-10-22 13:01:22 +02:00
  • 7f2429e6b0 tool-calls: fix grammar regression Olivier Chafik 2024-10-22 11:49:50 +01:00
  • 4ff7fe1fb3 llama : add chat template for RWKV-World + fix EOT (#9968) b3958 Molly Sophia 2024-10-22 18:33:37 +08:00
  • b53362a148 Update test-tool-call.cpp ochafik 2024-10-22 10:54:48 +01:00
  • 9f5ab97756 tool-calls: add generic tool call style as default ochafik 2024-10-22 10:53:21 +01:00
  • 0b9b5b42fb readme: add rwkv into supported model list Molly Sophia 2024-10-22 17:53:17 +08:00
  • fa8462ffd3 fix root ochafik 2024-10-22 10:53:01 +01:00
  • 75764871e6 tool-call: fix grammar roots ochafik 2024-10-22 10:50:52 +01:00
  • 6b8447352d
    [CANN] Adapt to dynamically loadable backends mechanism (#9970) b3957 leo-pony 2024-10-22 16:16:01 +08:00
  • 674804a996
    arg : fix typo in embeddings argument help [no ci] (#9994) Daniel Bevenius 2024-10-22 09:40:02 +02:00
  • 36ab274331 minor Michael Coppola 2024-10-22 03:10:19 -04:00
  • 4f6919ce2e minor Michael Coppola 2024-10-22 03:07:34 -04:00
  • edaa930f84 removed uneeded var Michael Coppola 2024-10-22 02:56:24 -04:00
  • 3402ab812f minor doc update Michael Coppola 2024-10-22 02:47:03 -04:00
  • 1713d34392 minor Michael Coppola 2024-10-22 02:36:26 -04:00
  • 09cbb43bcb minor Michael Coppola 2024-10-22 02:29:11 -04:00
  • 2dcbb00e06 fixed ring update, removed blank line Michael Coppola 2024-10-22 02:25:51 -04:00
  • 90e3bdc942 added classic vim support Michael Coppola 2024-10-22 02:17:08 -04:00
  • f888f6aa4e arg : fix typo in embeddings argument help Daniel Bevenius 2024-10-22 07:55:27 +02:00
  • d42b46bc85 Extend sgemm.cpp support for Q5_0 Srihari-mcw 2024-10-21 19:33:41 -07:00
  • abf5be4cfa Handle the review comments of this pull request leo-pony 2024-10-22 10:26:58 +08:00
  • e94a138d64 llama.vim : fix info text display [no ci] (#9787) Georgi Gerganov 2024-10-22 00:35:25 +03:00
  • d3b183f743 more clever way to exclude libm if needed 蕭澧邦 2024-10-22 05:23:45 +08:00
  • 4b19f970d2 minor changes Omar 2024-10-21 23:14:59 +03:00