Commit graph

  • 330e28df08 Update common/common.cpp Pierrick Hymbert 2024-03-16 17:48:20 +01:00
  • 9565ae3187 Update common/common.cpp Pierrick Hymbert 2024-03-16 17:48:10 +01:00
  • f22456d8c3 Update common/common.cpp Pierrick Hymbert 2024-03-16 17:48:02 +01:00
  • b088122719 Update common/common.cpp Pierrick Hymbert 2024-03-16 17:47:04 +01:00
  • f53bfd56af Update common/common.cpp Pierrick Hymbert 2024-03-16 17:46:53 +01:00
  • 8751bd0c82 Update common/common.cpp Pierrick Hymbert 2024-03-16 17:46:46 +01:00
  • 4bc47b75ca Update common/common.cpp Pierrick Hymbert 2024-03-16 17:46:34 +01:00
  • e84206d132 Update examples/server/README.md Pierrick Hymbert 2024-03-16 17:46:18 +01:00
  • 1430e895fc Merge branch 'master' into hp/download-model-from-hf Pierrick HYMBERT 2024-03-16 16:57:24 +01:00
  • c47cf414ef ggml : add AVX512F SIMD (#6088) b2447 AmirAli Mirian 2024-03-16 11:52:02 -04:00
  • 6633689fa5 llama_load_model_from_url: cleanup code Pierrick HYMBERT 2024-03-16 16:49:44 +01:00
  • b5f4ae09c3 gritlm : add initial README.md (#6086) Daniel Bevenius 2024-03-16 16:46:29 +01:00
  • dfbfdd60f9 readme : add wllama as a wasm binding (#6100) Xuan Son Nguyen 2024-03-16 16:42:08 +01:00
  • 15961ec04d common : refactor nested if causing error C1061 on MSVC (#6101) b2444 DAN™ 2024-03-16 11:39:15 -04:00
  • aa450bea4b Add flag to track found arguments. DAN™ 2024-03-16 11:24:06 -04:00
  • 921e4af930 ci: build, fix the default build to use LLAMA_CURL Pierrick HYMBERT 2024-03-16 16:29:02 +01:00
  • 5d99f3224f llama_load_model_from_url: download the file only if modified based on etag and last-modified http headers Pierrick HYMBERT 2024-03-16 16:27:06 +01:00
  • 4135d4a505 llama_load_model_from_url: typo Pierrick HYMBERT 2024-03-16 14:26:17 +01:00
  • 7a8a471321 Revert back and remove else's. DAN™ 2024-03-16 11:15:09 -04:00
  • 2c3a00e270 Update Makefile Pierrick Hymbert 2024-03-16 15:40:29 +01:00
  • eac00a72d5 Update ggml.c Julia Longtin 2024-03-16 14:17:21 +00:00
  • e216a2f133 Update ggml.c Julia Longtin 2024-03-16 14:15:51 +00:00
  • 257ffd9955 Update ggml.c Julia Longtin 2024-03-16 14:13:22 +00:00
  • 717e164dd7 implement F32 dot products. Julia Longtin 2024-03-16 14:05:03 +00:00
  • 0de79f7316 Refactor nested if causing error C1061 on MSVC. DAN™ 2024-03-16 10:00:46 -04:00
  • c0fe6298ae fix CUDA split buffers slaren 2024-03-15 21:35:30 +01:00
  • 3a774427ae code cleanup slaren 2024-03-15 17:43:53 +01:00
  • c2dba0450f fix hip slaren 2024-03-15 15:36:09 +01:00
  • 5b6b4ac235 backend : offload large batches to GPU slaren 2024-03-14 12:42:07 +01:00
  • 80bec9890a llama_load_model_from_url: try to make the windows build passing Pierrick HYMBERT 2024-03-16 14:08:21 +01:00
  • df0d82289c ci: compile the server with curl, add make option curl example in default cmake Pierrick HYMBERT 2024-03-16 13:52:17 +01:00
  • 7e782856bd common: LLAMA_USE_CURL in make toolchain Pierrick HYMBERT 2024-03-16 13:45:09 +01:00
  • 42b25dacab common: PR feedback, rename the definition to LLAMA_USE_CURL Pierrick HYMBERT 2024-03-16 13:27:05 +01:00
  • a56d09a440 ci : close inactive issue with workflow (#6053) Pierrick Hymbert 2024-03-16 13:20:53 +01:00
  • f9ae8e093f ci: close issue, change workflow schedule time Pierrick HYMBERT 2024-03-16 13:17:35 +01:00
  • 22bdb33952 readme: add wllama binding ngxson 2024-03-16 13:13:27 +01:00
  • a0ebdfcc5d common: llama_load_model_from_url witch to libcurl dependency Pierrick HYMBERT 2024-03-16 11:32:29 +01:00
  • 391b17e7f6 json: support mix of additional props & required/optional ochafik 2024-03-16 11:13:29 +00:00
  • f30d6c27b9 json: simplify test ochafik 2024-03-16 10:35:41 +00:00
  • 3221ab01ad common: introduce llama_load_model_from_url to download model from hf url using libopenssl only Pierrick HYMBERT 2024-03-16 09:59:05 +01:00
  • 5abcbf42aa Fixed Windows MSVC Compilation Deacon 2024-03-15 23:21:56 -07:00
  • 5536360816 ggml:fix finding transfer queue family index error GainLee 2024-03-16 09:46:45 +08:00
  • 5602a8b649 Merge remote-tracking branch 'origin/master' into json-fixes ochafik 2024-03-16 00:45:07 +00:00
  • 842eb834c5 json: re-ran server deps.sh ochafik 2024-03-16 00:36:36 +00:00
  • af31aa20b4 Revamp test cmake to allow args (WORKING_DIRECTORY needed for JSON test) ochafik 2024-03-16 00:19:44 +00:00
  • 7dbc4c9c9b Merge 3f96f1c079 into d84c48505f Philipp Emanuel Weidmann 2024-03-15 17:11:14 -05:00
  • d84c48505f llama : fix Baichuan2 13B (#6092) slaren 2024-03-15 22:14:16 +01:00
  • 18e109d3f4 llama : fix Baichuan2 13B slaren 2024-03-15 22:02:53 +01:00
  • 877b4d0c62 llama : add support for control vectors (#5970) Theia Vogel 2024-03-15 13:43:02 -07:00
  • 12247f4c69 llama : add Command-R support (#6033) b2440 Andrew Canis 2024-03-15 16:41:22 -04:00
  • 838c99c7d5 disable control vector when data == nullptr Theia Vogel 2024-03-15 12:59:08 -07:00
  • ab7012d149 added AVX512F macros in ggml. amiralimi 2024-03-15 17:31:11 +00:00
  • 0fb77a40bd squash! gritlm: add initial README.md to examples/gritlm Daniel Bevenius 2024-03-15 18:18:16 +01:00
  • 07dd8e02b8 Merge 8d0033ad63 into 4e9a7f7f7f Romain D 2024-03-15 16:07:07 +00:00
  • 38cfa8c044 squash! gritlm: add initial README.md to examples/gritlm Daniel Bevenius 2024-03-15 16:38:24 +01:00
  • 4cdfa83a1b gritlm: add initial README.md to examples/gritlm Daniel Bevenius 2024-03-15 15:42:43 +01:00
  • 4e9a7f7f7f llava : change API to pure C style for Rust FFI bindgen (#6079) b2439 Ting Lou 2024-03-15 22:31:05 +08:00
  • 8557b21590 Add Command-R Model Andrew Canis 2024-03-13 01:54:34 -04:00
  • 64472c9e97 Update index.html to make the chat box wider 0xez 2024-03-15 20:41:58 +08:00
  • 3020327f6c cuda : disable unused cudaLaunchHostFunc code (#6078) b2438 slaren 2024-03-15 13:24:03 +01:00
  • 71583d8adb Merge branch 'ggerganov:master' into master Ting Lou 2024-03-15 19:28:36 +08:00
  • c3c20efc58 cuda : disable unused cudaLaunchHostFunc code slaren 2024-03-15 12:15:25 +01:00
  • 46acb36767 fix set main gpu error (#6073) b2437 Neo Zhang Jianyu 2024-03-15 18:53:53 +08:00
  • 5714487830 json: basic support for reserved names {number:{number:{root:number}}} ochafik 2024-03-15 10:35:34 +00:00
  • de7f00f312 ggml:fix finding transfer queue family error ligen 2024-03-15 18:33:16 +08:00
  • 5d238634ef llava : Change llava's API to pure C style to simplify Rust bindgen Lou Ting 2024-03-15 18:22:06 +08:00
  • daceced65e nit ochafik 2024-03-15 10:07:20 +00:00
  • 235ff6858d json: don't use c++20 designated initializers ochafik 2024-03-15 10:03:57 +00:00
  • 131b058409 make : ggml-metal.o depends on ggml.h b2436 Georgi Gerganov 2024-03-15 11:36:50 +02:00
  • 753e36f650 [SYCL] Fix non-intel device selection (#6042) b2435 AidanBeltonS 2024-03-15 09:26:20 +00:00
  • 7ce2c77f88 gguf : add support for I64 and F64 arrays (#6062) b2434 Ondřej Čertík 2024-03-15 02:46:51 -06:00
  • aab606a11f llama : add Orion chat template (#6066) b2433 Xuan Son Nguyen 2024-03-15 09:44:57 +01:00
  • b0bc9f4a9d llama-bench : use random tokens to improve accuracy with mixtral (#6069) b2432 slaren 2024-03-15 09:22:24 +01:00
  • 87e5c86686 allow iq quant abhilash1910 2024-03-15 00:33:35 -07:00
  • 43c2c13685 resolve-conflicts simonJJJ 2024-03-15 15:25:54 +08:00
  • 56a38c46f5 support qwen2moe simonJJJ 2024-03-15 15:12:24 +08:00
  • 74efd7ebc2 fix set main gpu error Jianyu Zhang 2024-03-15 14:45:12 +08:00
  • baadd37fc3 llama : fix integer overflow during quantization (#6063) Georgi Gerganov 2024-03-14 22:58:41 +02:00
  • 0c3c10b0b3 gguf : fix resource leaks (#6061) Steve Grubb 2024-03-14 14:29:32 -04:00
  • 2c292751fb gguf-py : bump version to 0.8.0 (#6060) Ondřej Čertík 2024-03-14 11:57:31 -06:00
  • 72797a2a41 llama : support models without vocabulary (#5798) Michael Podvitskiy 2024-03-14 17:21:56 +01:00
  • d5dc5d2829 embedding : add EOS token if not present (#899) Georgi Gerganov 2024-03-14 15:14:14 +02:00
  • 42b03c4e4f gguf-py : fix dtype check (#6045) Georgi Gerganov 2024-03-14 13:32:14 +02:00
  • 51945cfa8d readme : improve readme for Llava-1.6 example (#6044) Jian Liao 2024-03-14 04:18:23 -07:00
  • 9d16ae7575 server: disable debug release type sanitizer, simplify trigger (#6047) Pierrick Hymbert 2024-03-14 12:15:39 +01:00
  • f36e0a5edc llama : fix typo Georgi Gerganov 2024-03-14 13:13:06 +02:00
  • d2651dd6ef llama : optimize defrag moves + fix fragmentation calculation (#6037) Michael Podvitskiy 2024-03-14 11:56:48 +01:00
  • 3d317f6b46 gguf-py : add support for I8, I16 and I32 (#6045) Ondřej Čertík 2024-03-14 04:40:14 -06:00
  • 58fd22796f ggml : designate enum vals for integer types (#6050) Georgi Gerganov 2024-03-14 12:38:37 +02:00
  • 42ff703854 embedding : print all resulting embeddings (#899) Georgi Gerganov 2024-03-14 12:37:20 +02:00
  • 5f34a219e1 metal : build metallib + fix embed path (#6015) Georgi Gerganov 2024-03-14 11:55:23 +02:00
  • 6bde412da2 embedding : print cosine similarity (#899) Georgi Gerganov 2024-03-14 10:12:29 +02:00
  • 8e86d5677d readme : update details about running llama in Termux on Android (#6039) Linwei Wang 2024-03-14 02:34:40 +08:00
  • dbde7d35fa readme : update API changes and hot topics Georgi Gerganov 2024-03-13 20:33:56 +02:00
  • 119accb98c grammar : handle missing "root" node (#6004) Clint Herron 2024-03-13 14:10:40 -04:00
  • 30fcee1e50 llama : add pipeline parallelism support (#6017) slaren 2024-03-13 18:54:21 +01:00
  • 2784b84506 test-backend-ops : skip CPU backend by default (#6028) slaren 2024-03-13 14:58:30 +01:00
  • 463ab3ed31 Update get version (#6025) AidanBeltonS 2024-03-13 13:17:54 +00:00
  • 42810ddfbd Server: Use multi-task for embeddings endpoint (#6001) Xuan Son Nguyen 2024-03-13 11:39:11 +01:00
  • d6625ce30a ci : remove tidy-review (#6021) slaren 2024-03-12 16:55:19 +01:00
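
Commit 5d99f3224f above describes re-downloading a model file "only if modified based on etag and last-modified http headers". The sketch below illustrates that general HTTP caching idea only; it is not the llama.cpp implementation (which lives in common/common.cpp and uses libcurl), and the helper names here are hypothetical.

```python
# Illustrative sketch of ETag / Last-Modified conditional downloading,
# the caching idea referenced in commit 5d99f3224f. Not llama.cpp code;
# conditional_headers and should_reuse_cache are hypothetical names.

def conditional_headers(cached_meta):
    """Build request headers that turn a plain GET into a conditional GET.

    cached_meta holds the validators saved from the previous download:
    the ETag and/or Last-Modified values the server sent back then.
    """
    headers = {}
    if cached_meta.get("etag"):
        # Server returns 304 Not Modified if the ETag still matches.
        headers["If-None-Match"] = cached_meta["etag"]
    if cached_meta.get("last_modified"):
        # Fallback validator when no ETag is available.
        headers["If-Modified-Since"] = cached_meta["last_modified"]
    return headers

def should_reuse_cache(status_code):
    """HTTP 304 Not Modified means the locally cached file is still current."""
    return status_code == 304
```

With headers built this way, a client sends the conditional GET, keeps the existing file on a 304 response, and only writes a new file (and stores the fresh ETag/Last-Modified) on a 200 response.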