Commit graph

  • 1641c521fe Merge branch 'ggerganov:master' into iq2_s Abhilash Majumder 2024-03-18 19:31:56 +05:30
  • 658ea8abfb backend : set max split inputs to GGML_MAX_SRC slaren 2024-03-18 14:09:23 +01:00
  • 59222320f8 gguf-split: split and merge gguf files per tensor Pierrick HYMBERT 2024-03-18 00:00:40 +01:00
  • ac9ee6a4ad ci : disable stale issue messages (#6126) b2456 Georgi Gerganov 2024-03-18 13:45:38 +02:00
  • 4f6d1337ca ci : temporary disable sanitizer builds (#6128) b2455 Georgi Gerganov 2024-03-18 13:45:27 +02:00
  • e9da81310e Force create a release. Nicolas Patry 2024-03-18 12:29:23 +01:00
  • 30e3c9fbba Revisited Readme-sycl Ouadie EL FAROUKI 2024-03-18 11:10:02 +00:00
  • f666df2ffe Even less CI ? Nicolas Patry 2024-03-18 12:20:30 +01:00
  • c796b5c9d0 Less workflows. Nicolas Patry 2024-03-18 12:15:24 +01:00
  • 49dbc55894 Prebuild with metal support. Nicolas Patry 2024-03-18 12:10:03 +01:00
  • 2bf8d0f7c4 backend : offload large batches to GPU (#6083) b2454 slaren 2024-03-18 11:03:04 +01:00
  • 798bca4101 ci : temporary disable sanitizer builds Georgi Gerganov 2024-03-18 10:33:30 +02:00
  • 496bc79bc2 common : tidy-up argument parsing (#6105) b2453 DAN™ 2024-03-18 04:27:44 -04:00
  • 9b03719ad7 convert : add support for CamembertModel architecture (#6119) Thérence 2024-03-18 09:17:00 +01:00
  • 7fd8b8a907 common : disable repeat penalties Georgi Gerganov 2024-03-18 10:08:57 +02:00
  • 3a6efdd03c convert : use f32 outtype for bf16 tensors (#6106) Romain D 2024-03-18 09:04:41 +01:00
  • 12bd14e1a1 common : add static classifier Georgi Gerganov 2024-03-18 09:49:32 +02:00
  • cdeb8183c8 Merge branch 'master' into HEAD Georgi Gerganov 2024-03-18 09:46:34 +02:00
  • 393a8b50ea common : minor Georgi Gerganov 2024-03-18 09:41:14 +02:00
  • 4003ddf0ce ci : disable stale issue messages Georgi Gerganov 2024-03-18 09:18:08 +02:00
  • bd96df4e85 json: ws nit ochafik 2024-03-18 04:42:25 +00:00
  • 6bf7f3f41c ggml : do not multi-thread ops returning empty tensors Francis Couture-Harpin 2024-03-18 00:35:03 -04:00
  • 99c37ccb6b ggml : saner ggml_can_repeat with empty tensors Francis Couture-Harpin 2024-03-17 23:23:30 -04:00
  • 24f0b941cf json: fix string patterns (was missing quotes) ochafik 2024-03-18 04:06:23 +00:00
  • d100502251 llama : keep same graph topology even when n_outputs == 0 Francis Couture-Harpin 2024-03-17 22:04:42 -04:00
  • dd922a4da3 json: test/fix additional props corner cases ochafik 2024-03-18 01:32:15 +00:00
  • 711b0bcb11 llama : fix running a batch with n_outputs == 0 Francis Couture-Harpin 2024-03-17 20:41:21 -04:00
  • bbd70800c8 json: improve grammar parsing failures ochafik 2024-03-18 00:34:02 +00:00
  • a57fa7faa4 llama : fix not-skipping outputs of non-causal models Francis Couture-Harpin 2024-03-17 20:19:25 -04:00
  • 618247885c json: test/fix top-level anyOf ochafik 2024-03-18 00:13:58 +00:00
  • 20869ede26 Merge remote-tracking branch 'origin/master' into json-fixes ochafik 2024-03-17 22:53:04 +00:00
  • edbd2e9862 json: add server tests for OAI JSON response_format ochafik 2024-03-17 22:51:29 +00:00
  • 3e1bf44e5e json: check parsing in test + fix value & string refs ochafik 2024-03-17 22:47:20 +00:00
  • 84e383c1d7 json: test (& simplify output of) empty schema ochafik 2024-03-17 21:51:10 +00:00
  • ab6f3a8a8d Update ggml-phi-knc.c Julia Longtin 2024-03-17 21:36:14 +00:00
  • f882673ba6 add a benchmark / test binary. Julia Longtin 2024-03-17 21:20:14 +00:00
  • fe663c1b63 merge from upstream Julia Longtin 2024-03-17 21:15:32 +00:00
  • e19cb3aeb7 llama : fix wrong n_outputs in llama_set_inputs Francis Couture-Harpin 2024-03-17 16:31:19 -04:00
  • 5c50ffaeac json: fix type=const in c++, add failure expectations for non-str const&enum ochafik 2024-03-17 21:03:48 +00:00
  • 64799baea1 json: add tests for some expected failures ochafik 2024-03-17 21:01:02 +00:00
  • 7db32e531d Merge pull request #1 from Royalphax/Royalphax-patch-1 Thérence 2024-03-17 21:15:09 +01:00
  • f31d128159 Add support for CamembertModel architecture Thérence 2024-03-17 21:09:43 +01:00
  • 408fcb0f91 llama : fix llama_get_embeddings_ith when the resulting id is 0 Francis Couture-Harpin 2024-03-17 15:34:56 -04:00
  • 487f89ec2e llama : fix embedding conditions Francis Couture-Harpin 2024-03-17 15:23:44 -04:00
  • d0129e8e29 perplexity : normalize spaces and punctuation in Winogrande sentences Francis Couture-Harpin 2024-03-17 14:54:09 -04:00
  • 17b45c96ed perplexity : fix Winogrande, use correct logits for second choice start Francis Couture-Harpin 2024-03-16 22:05:44 -04:00
  • 25981fca37 perplexity : adapt to the logits API changes Francis Couture-Harpin 2024-03-16 21:36:48 -04:00
  • 705d3937ea llama : fix lctx.n_outputs not being set before building graph Francis Couture-Harpin 2024-03-16 17:24:05 -04:00
  • 98914c0ed0 llama : more compact state saving and reloading Francis Couture-Harpin 2024-03-15 12:21:24 -04:00
  • 1fd1918bdc llama : greatly reduce logits memory usage Francis Couture-Harpin 2024-03-15 00:46:34 -04:00
  • d01b3c4c32 common: llama_load_model_from_url using --model-url (#6098) b2450 Pierrick Hymbert 2024-03-17 19:12:37 +01:00
  • cb6636e0de Add k-quant mul mat mat shaders 0cc4m 2024-03-17 18:52:25 +01:00
  • cd776c37c9 ci : close all stale issues at once (#6115) b2449 Georgi Gerganov 2024-03-17 19:51:57 +02:00
  • 9445ef816d ci : close all stale issues at once Georgi Gerganov 2024-03-17 19:49:01 +02:00
  • cc9299ce19 update backends slaren 2024-03-17 14:55:35 +01:00
  • dc0f612548 ggml:fix finding transfer queue family index error (#6094) b2448 GainLee 2024-03-18 01:12:22 +08:00
  • fcf327f0e6 ci: tests: fix behavior on windows Pierrick HYMBERT 2024-03-17 17:45:09 +01:00
  • b24f30fdad common: llama_load_model_from_url delete previous file before downloading Pierrick HYMBERT 2024-03-17 16:52:38 +01:00
  • f902ab6de2 common: llama_load_model_from_url use a temporary file for downloading Pierrick HYMBERT 2024-03-17 16:37:02 +01:00
  • 31272c635a common: fix typo Pierrick HYMBERT 2024-03-17 16:46:53 +01:00
  • 47a9e5d76c ci: tests: increase timeout for windows Pierrick HYMBERT 2024-03-17 16:37:40 +01:00
  • 4fe431d429 common: llama_load_model_from_url: make it working on windows: disable global curl function, use a write callback. Pierrick HYMBERT 2024-03-17 16:31:34 +01:00
  • cff7faaccb ci: tests: print server logs in case of scenario failure Pierrick HYMBERT 2024-03-17 16:28:01 +01:00
  • 0661e6a1ae sched : add a new split if the current one has too many inputs reduce max inputs per split more cleanup slaren 2024-03-16 20:28:22 +01:00
  • ca7a2f81b3 Merge 72a9f4ea8c into c47cf414ef Yavor Ivanov 2024-03-17 03:14:42 -07:00
  • c1b002e067 common: llama_load_model_from_url windows set CURLOPT_SSL_OPTIONS, CURLSSLOPT_NATIVE_CA Pierrick HYMBERT 2024-03-17 09:35:19 +01:00
  • 5c7970bf45 flake.lock: Update github-actions[bot] 2024-03-17 06:37:44 +00:00
  • f3a3ea1ff3 Merge branch 'ggerganov:master' into iq2_s Abhilash Majumder 2024-03-17 11:23:09 +05:30
  • 9ca4acc5fb common: fix windows tests Pierrick HYMBERT 2024-03-17 02:30:20 +01:00
  • 5e66ec80b3 common: fix windows tests Pierrick HYMBERT 2024-03-17 02:07:06 +01:00
  • a3ed3d48d3 common: fix windows build Pierrick HYMBERT 2024-03-17 01:17:58 +01:00
  • 73b4b44785 common: fix build Pierrick HYMBERT 2024-03-17 00:43:35 +01:00
  • 1ddaf7109a common: remove old dependency to openssl Pierrick HYMBERT 2024-03-16 22:43:05 +01:00
  • 13d8817ce2 ci: build: try to fix the windows build Pierrick HYMBERT 2024-03-16 22:34:01 +01:00
  • 89d3483860 ci: build: fix ubuntu-focal-make-curl Pierrick HYMBERT 2024-03-16 22:27:02 +01:00
  • 9da4eec082 llama_load_model_from_url: minor spacing and log message changes Pierrick HYMBERT 2024-03-16 22:13:46 +01:00
  • dbd969142e build: move the make build with env LLAMA_CURL to a dedicated place Pierrick HYMBERT 2024-03-16 22:01:19 +01:00
  • d81acb6847 build: introduce cmake option LLAMA_CURL to trigger libcurl linking to be coherent with the make toolchain Pierrick HYMBERT 2024-03-16 21:59:53 +01:00
  • e6848ab0e6 build: move the make build with env LLAMA_CURL to a dedicated place Pierrick HYMBERT 2024-03-16 21:53:07 +01:00
  • 22b3bb3ceb common: fix windows build caused by double windows.h import Pierrick HYMBERT 2024-03-16 21:50:37 +01:00
  • 6568836659 convert : use f32 outtype for bf16 tensors Romain “Artefact2” Dal Maso 2024-03-16 21:36:06 +01:00
  • 078a67b04b Merge branch 'master' into master StrangeBytesDev 2024-03-16 13:02:27 -07:00
  • 1ad5a45210 ci: build: add libcurl in default make toolchain step for tests Pierrick HYMBERT 2024-03-16 20:06:18 +01:00
  • 78812c6d63 llama_load_model_from_url: PR feedback, use snprintf instead of strncp and strncat Pierrick HYMBERT 2024-03-16 20:02:34 +01:00
  • 980907595f imatrix : remove sched affix from weight names slaren 2024-03-16 19:58:53 +01:00
  • 9cba8a183d cuda : fix memset without set_device slaren 2024-03-16 18:37:57 +01:00
  • 5df5605b02 ci: build: add libcurl in default make toolchain step Pierrick HYMBERT 2024-03-16 19:52:11 +01:00
  • 176f039a91 ci: tests: windows tests add libcurl Pierrick HYMBERT 2024-03-16 19:51:44 +01:00
  • 8e717e8cb8 Update ggml-backend-impl.h slaren 2024-03-16 18:48:45 +01:00
  • 838178a196 ci: tests: windows tests add libcurl Pierrick HYMBERT 2024-03-16 18:34:53 +01:00
  • 064dc076bb common: CMakeLists.txt fix typo in logging when lib curl is not found Pierrick HYMBERT 2024-03-16 18:34:36 +01:00
  • a63dbc3497 Missing ref. DAN™ 2024-03-16 13:29:29 -04:00
  • 124c474bba llama_load_model_from_url: coherent clearer logging Pierrick HYMBERT 2024-03-16 18:24:21 +01:00
  • 4fadb072e9 server: tests: add --model-url tests Pierrick HYMBERT 2024-03-16 18:15:20 +01:00
  • 545fef6e0e llama_load_model_from_url: fix compilation warning, clearer logging Pierrick HYMBERT 2024-03-16 18:01:55 +01:00
  • a132168d6d Tidy-up argument parsing. DAN™ 2024-03-16 13:00:43 -04:00
  • b0b49e0bb8 Update examples/main/README.md Pierrick Hymbert 2024-03-16 17:48:48 +01:00
  • eb9e52a218 Update common/common.cpp Pierrick Hymbert 2024-03-16 17:48:38 +01:00
  • be561a7ffd Update common/common.cpp Pierrick Hymbert 2024-03-16 17:48:32 +01:00
  • 89ab37a261 Update common/common.cpp Pierrick Hymbert 2024-03-16 17:48:27 +01:00