Commit graph

  • 1585fec58b
    Update ggml.c bmwl 2024-02-15 07:13:00 -08:00
  • 4ffe18ee5e
    Update ggml.c bmwl 2024-02-15 07:12:24 -08:00
  • dc828c4556
    Update ggml.h bmwl 2024-02-15 07:11:28 -08:00
  • af50604c6e
    ggml : strict definition constantify for C compiler Herman Semenov 2024-02-15 18:05:16 +03:00
  • 4453456c5a
    llava: update link to vit config in README.md Daniel Bevenius 2024-02-15 15:41:37 +01:00
  • 113e0d5d1b
    cuda : fix performance (pow -> powf) Georgi Gerganov 2024-02-15 16:02:24 +02:00
  • 3aa20e7446
    examples : constantify lambda variables Herman Semenov 2024-02-15 17:01:25 +03:00
  • c7f0639ae4
    Remove unnecessary waits Aidan 2024-02-15 13:48:33 +00:00
  • 9350a1cf21
    scripts : add hf.sh helper script (#5501) Georgi Gerganov 2024-02-15 15:41:15 +02:00
  • 7798a9bb73
    llava: fix clip-model-is-vision flag in README.md Daniel Bevenius 2024-02-15 14:19:41 +01:00
  • 73122473ff
    fix(gguf-py): special tokens are no longer skipped when add_<token>_token is set to false (#5487) Michaël de Vries 2024-02-15 14:14:37 +01:00
  • e856bfed3b
    hf : add support for --repo and --file gg/hf Georgi Gerganov 2024-02-15 15:05:15 +02:00
  • e834aa1fd4
    hf : add error logs Georgi Gerganov 2024-02-15 14:59:12 +02:00
  • b2c055b8af
    ggml : fix pos ptr when no ALiBi Georgi Gerganov 2024-02-15 14:30:50 +02:00
  • e3d4b99a9e
    ggml : update deprecation message Georgi Gerganov 2024-02-15 14:18:37 +02:00
  • 8c7b9ee28c
    cuda : add multi-seq ALiBi + remove F16 soft_max Georgi Gerganov 2024-02-15 14:17:28 +02:00
  • 88c272a622
    Update ggml_sycl_mul_mat_batched_sycl Aidan 2024-02-15 11:19:03 +00:00
  • 996f7f4ec5
    ggml : support multi-sequence ALiBi (Metal) Georgi Gerganov 2024-02-15 13:46:26 +02:00
  • 9d05e6a0aa
    Fix gemm_batch_impl Aidan 2024-02-15 11:11:19 +00:00
  • 0fe2d56001
    ggml : deprecate ggml_alibi Georgi Gerganov 2024-02-15 13:11:13 +02:00
  • 3b5dc11be8
    fix(gguf-py): added missing cls and mask token ids to the gguf metadata Michaël de Vries 2024-02-15 08:49:13 +01:00
  • 0d4177126b
    llava : fix memory management bug (#5491) Elbios 2024-02-15 09:01:57 +01:00
  • 7930a8a6e8
    llava : hotfix for llava-1.6 image number (#5495) John 2024-02-15 08:59:18 +01:00
  • 303da63442
    scripts : add hf.sh helper scripts Georgi Gerganov 2024-02-15 09:54:20 +02:00
  • ed749b8128
    use correct type of pooling for embedding models Douglas Hanley 2024-02-14 23:46:50 -06:00
  • 704359e299
    vulkan: Find optimal memory type but with fallback (#5381) Neuman Vong 2024-02-15 17:11:15 +11:00
  • 6cc749e6f0
    More feedback @0cc4m Neuman Vong 2024-02-15 09:12:41 +11:00
  • e237527feb
    Added #ifdefs for non-Linux OS that don't have cpu_set_t datatype root 2024-02-14 19:37:10 +00:00
  • 098ab943d8
    Make it cleaner by checking size in batch free wrapper Elbios 2024-02-14 19:35:23 +01:00
  • 7fb5427813
    Fix up some boolean vs enum comparisons root 2024-02-14 17:14:26 +00:00
  • 71fdd5474f
    bugfix image number John 2024-02-14 18:12:52 +01:00
  • a47bb697cb
    Merge branch 'ggerganov:master' into master bmwl 2024-02-14 08:53:54 -08:00
  • 9a4d128226
    llava example fix for wide images James O'Leary 2024-02-14 11:50:06 -05:00
  • a0f8a93bf1
    cuda : add ALiBi support in ggml_soft_max_ext Georgi Gerganov 2024-02-14 18:29:24 +02:00
  • ea74ba9116
    Fix memory management in llava and server code Elbios 2024-02-14 16:59:46 +01:00
  • 97d6a0cc06
    ggml : alternative ALiBi without extra tensor Georgi Gerganov 2024-02-14 17:37:48 +02:00
  • 594fca3fef
    readme : fix typo (#5490) Rune 2024-02-14 16:15:49 +01:00
  • 79d8209cf6
    Fixed typo in README.md Rune 2024-02-14 16:13:43 +01:00
  • 5261fb2dbe
    tests : do not use slope for large soft_max Georgi Gerganov 2024-02-14 16:54:54 +02:00
  • 69da57c00b
    ggml : handle all SRCs (do not break on first null) Georgi Gerganov 2024-02-14 16:51:01 +02:00
  • ccbb277f46
    llava : update README.md (#5489) John 2024-02-14 15:49:42 +01:00
  • 6955b4ef89
    Update examples/llava/README.md Georgi Gerganov 2024-02-14 16:49:28 +02:00
  • b4a52fe695
    Update README.md John 2024-02-14 15:36:21 +01:00
  • 1c73822803
    Update README.md John 2024-02-14 15:30:42 +01:00
  • 5055a0c990
    ggml : support alibi bias in ggml_soft_max_ext (CPU + Metal) Georgi Gerganov 2024-02-14 16:03:42 +02:00
  • 8994ac8c1a
    @0cc4m feedback Neuman Vong 2024-02-15 00:01:26 +11:00
  • f428652b2a
    fix(gguf-py): special tokens are no longer skipped when add_<token>_token is set to false Michaël de Vries 2024-02-14 14:05:57 +01:00
  • 6ca762eccf
    llama : reuse hparams.f_max_alibi_bias in all cases Georgi Gerganov 2024-02-14 13:54:55 +02:00
  • 7e0c3778fb
    ggml : avoid recomputing alibi slopes (CPU) Georgi Gerganov 2024-02-14 13:54:23 +02:00
  • c590bceaef
    Merge branch 'ggerganov:master' into master bmwl 2024-02-14 01:46:33 -08:00
  • 0fb40ae755
    split numa init out from llama_backend_init and created llama_numa_init. Updated all code paths and samples root 2024-02-14 09:46:06 +00:00
  • 8084d55440
    cmake : ARM intrinsics detection for MSVC (#5401) Michael Podvitskiy 2024-02-14 11:49:01 +03:00
  • aa23412989
    llava : support v1.6 (#5267) John 2024-02-14 08:38:35 +01:00
  • 6727cfd21a
    llava : update readme a bit Georgi Gerganov 2024-02-14 09:35:57 +02:00
  • 7974ff7f02
    clip : minor code rearrange Georgi Gerganov 2024-02-14 09:34:16 +02:00
  • 394ba65621
    Merge 95a492a8c5 into f5ca054855 Riceball LEE 2024-02-14 01:46:22 -05:00
  • 0e05042b45
    Merge branch 'ggerganov:master' into master bmwl 2024-02-13 20:34:16 -08:00
  • c9874dd0d6
    bugfix for non llava-1.6 John 2024-02-14 05:05:57 +01:00
  • 9d04f3bd50
    Update unicode.h bobqianic 2024-02-14 00:49:19 +00:00
  • 3cd3964587
    Merge branch 'master' into master bobqianic 2024-02-14 00:44:30 +00:00
  • 49bba30999
    Merge pull request #1 from bobqianic/fix bobqianic 2024-02-14 00:39:58 +00:00
  • bb2e1e869d
    Update llama.cpp bobqianic 2024-02-14 00:39:22 +00:00
  • 69aaa249b3
    Add files via upload bobqianic 2024-02-14 00:35:25 +00:00
  • 4447e95ec5
    fix(server): infinite loop to inference Riceball LEE 2024-02-14 07:40:51 +08:00
  • f5ca054855
    Early return for zero size calls to get_tensor. (#5482) AT 2024-02-13 15:44:25 -06:00
  • 2d0586ecb0
    Since we do the early return in the generic backend now no reason to do so here as well. Adam Treat 2024-02-13 16:42:44 -05:00
  • 43bfc95a5a
    Early return after the assertions. Adam Treat 2024-02-13 14:20:06 -05:00
  • 590e773940
    Add an early return to the get/set tensor when the size is null. Adam Treat 2024-02-13 14:14:20 -05:00
  • 4ca3d401cd
    Update ggml-kompute.cpp AT 2024-02-13 14:01:19 -05:00
  • a83f68797d
    Update ggml-kompute.cpp AT 2024-02-13 14:01:14 -05:00
  • c92431a0a4
    server : remove clip structs Georgi Gerganov 2024-02-13 20:51:20 +02:00
  • 9d166b0850
    convert : add --skip-unknown CLI arg Georgi Gerganov 2024-02-13 20:43:45 +02:00
  • 89b1915de3
    Early return for zero size calls to get_tensor. Adam Treat 2024-02-13 13:41:33 -05:00
  • 997dd1fdf7
    llava : style Georgi Gerganov 2024-02-13 20:40:01 +02:00
  • a20c071d93
    Merge remote-tracking branch 'origin/master' into HEAD Georgi Gerganov 2024-02-13 20:26:36 +02:00
  • 65ec518d41
    llava : fix compile warnings Georgi Gerganov 2024-02-13 20:22:28 +02:00
  • a2848854a4
    llava : update readme Georgi Gerganov 2024-02-13 19:59:00 +02:00
  • 6b8d69b451
    convert : skip unknown tensors (need for LLaVA) Georgi Gerganov 2024-02-13 19:58:44 +02:00
  • 6c00a06692
    gguf : add python reader example (#5216) John 2024-02-13 18:56:38 +01:00
  • ea9c8e1143
    llama : add support for Nomic Embed (#5468) b2144 Jared Van Bortel 2024-02-13 12:03:53 -05:00
  • ccd757a174
    convert : fix mistakes from refactoring ceb/nomic-bert Jared Van Bortel 2024-02-13 11:59:11 -05:00
  • c2f407e398
    cleanup convert-hf-to-gguf.py Jared Van Bortel 2024-02-12 17:35:56 -05:00
  • b8ff85efe0
    convert : pad vocab size to multiple of 64, not 8 Jared Van Bortel 2024-02-12 16:47:00 -05:00
  • 48a7ef6ebc
    Nomic BERT Jared Van Bortel 2024-02-08 18:00:44 -05:00
  • c4e6dd59e4
    llama : allow raw byte in SPM vocabs; don't crash on nl 404 (#5478) b2143 Aarni Koskela 2024-02-13 18:18:16 +02:00
  • 72b353f555
    common : llama_byte_to_token: allow falling back to finding just the token byte in SPM vocabs Aarni Koskela 2024-02-13 13:05:51 +02:00
  • 93aed7595b
    common : don't crash if newline token is not found Aarni Koskela 2024-02-13 12:20:33 +02:00
  • 5315a5a8e4
    lintlintlint John 2024-02-13 15:17:26 +01:00
  • c8c2e95069
    Update reader.py John 2024-02-13 15:08:36 +01:00
  • 037259be68
    llama : make load error reporting more granular (#5477) b2142 Aarni Koskela 2024-02-13 15:24:50 +02:00
  • 5c977221d2
    iq1_s: slightly faster dot product ik/iq1_s Iwan Kawrakow 2024-02-13 15:18:27 +02:00
  • 263978904c
    finetune : rename feed-forward tensors (w1/w2/w3) (#4839) b2141 Daniel Bevenius 2024-02-13 14:15:42 +01:00
  • cf45252a7c
    tests : multi-thread the tokenizer tests (#5474) b2140 Georgi Gerganov 2024-02-13 15:14:22 +02:00
  • f604a17994
    iq1_s: Tests Iwan Kawrakow 2024-02-13 15:11:23 +02:00
  • 425c6bbb6c
    iq1_s: Metal works, but quite slow Iwan Kawrakow 2024-02-13 14:37:16 +02:00
  • 020b548ec3
    iq1_s: Metal basics Iwan Kawrakow 2024-02-13 14:16:30 +02:00
  • 03bf161eb6
    llama : support batched embeddings (#5466) b2139 Douglas Hanley 2024-02-13 06:06:58 -06:00
  • f4cccb7e0a
    llama : minor Georgi Gerganov 2024-02-13 14:06:20 +02:00
  • 39d370452c
    Merge branch 'master' into HEAD Georgi Gerganov 2024-02-13 13:59:07 +02:00
  • b650d4cbdf
    embd : minor improvements Georgi Gerganov 2024-02-13 13:52:50 +02:00