Commit graph

  • 9722cfd0bb remove crash mike dupont 2023-11-24 09:35:08 -05:00
  • b9276f3b59 Use mmap in torch load, prefer .bin files when loading Galunid 2023-11-24 15:19:46 +01:00
  • 70f836e267 demonstrate crash mike dupont 2023-11-24 09:16:51 -05:00
  • 189d68446e convert : fix tensors using grad in some models (#4173) Galunid 2023-11-24 15:02:49 +01:00
  • 08e7afacf7 clip: enable CUDA backend FSSRepo 2023-11-24 08:39:50 -05:00
  • ebea708561 adding new header for llama internal mike dupont 2023-11-24 08:31:12 -05:00
  • b61631426b server : change random string generator Georgi Gerganov 2023-11-24 11:39:03 +02:00
  • b3e88bf494 server : minor code style Georgi Gerganov 2023-11-24 11:33:49 +02:00
  • 2568a4bf54 main.swift : fix eos checking (#4197) b1557 eastriver 2023-11-24 18:25:10 +09:00
  • c544faed74 server : enable special tokens during tokenization by default Georgi Gerganov 2023-11-24 11:10:23 +02:00
  • b94b10914c server : indentation Georgi Gerganov 2023-11-24 11:00:15 +02:00
  • 80724eb0e1 Merge branch 'master' into server-oai-compat Georgi Gerganov 2023-11-24 10:54:08 +02:00
  • f25308be5c server : some style changes Georgi Gerganov 2023-11-24 10:49:08 +02:00
  • 33cda86666 main.swift: fix eos checking eastriver 2023-11-24 17:45:33 +09:00
  • a192910cb6 decode Robert Washbourne 2023-11-24 03:07:55 -05:00
  • b35f3d0def readme : use PATH for Windows ROCm (#4195) Aaryaman Vasishta 2023-11-24 16:52:39 +09:00
  • 7e8607d097 fix handler Robert Washbourne 2023-11-24 02:22:51 -05:00
  • 7e06600b38 disable Robert Washbourne 2023-11-24 01:58:58 -05:00
  • 1cde43fbb4 disable Robert Washbourne 2023-11-24 01:56:36 -05:00
  • 7280bb217a make Robert Washbourne 2023-11-24 01:54:21 -05:00
  • af5a4371da fix pip Robert Washbourne 2023-11-24 01:50:07 -05:00
  • 7be0016d40 Update README.md Aaryaman Vasishta 2023-11-24 15:35:46 +09:00
  • d56bd40563 Update README.md Aaryaman Vasishta 2023-11-24 15:35:03 +09:00
  • fc5999b130 Update README.md to use PATH for Windows ROCm Aaryaman Vasishta 2023-11-24 15:34:28 +09:00
  • 63961c0e75 copy handler Robert Washbourne 2023-11-24 01:33:35 -05:00
  • eb42c73953 revert auto rope scaling for already-ropetuned models - just use their values Concedo 2023-11-24 14:20:36 +08:00
  • c2ad2b02f3 Fine-tuning script optimization supermy 2023-11-24 14:18:01 +08:00
  • 9036005e51 syntax Robert Washbourne 2023-11-24 01:08:27 -05:00
  • 0507037432 move python Robert Washbourne 2023-11-24 01:05:31 -05:00
  • f571ed512a ws Robert Washbourne 2023-11-24 00:43:37 -05:00
  • 4468d96aec add handler Robert Washbourne 2023-11-24 00:41:33 -05:00
  • 2da27621a7 llama : updates from code review crasm 2023-11-24 00:25:32 -05:00
  • 8ddd5cb916 change prefix Robert Washbourne 2023-11-24 00:17:06 -05:00
  • 819d9f1258 cuda Robert Washbourne 2023-11-24 00:10:14 -05:00
  • 0e2c422b11 remove flags Robert Washbourne 2023-11-23 23:58:40 -05:00
  • 22804439d2 make Robert Washbourne 2023-11-23 23:51:34 -05:00
  • 4b6e344bad pip later Robert Washbourne 2023-11-23 23:43:23 -05:00
  • 93f86c98d1 Merge branch 'ggerganov:master' into master Robert Washbourne 2023-11-23 23:36:09 -05:00
  • c06162ba94 from build Robert Washbourne 2023-11-23 23:30:56 -05:00
  • 1b703db0e1 change entrypoint Robert Washbourne 2023-11-23 13:31:30 -05:00
  • dc1e34abf2 Merge branch 'master' into feat-seqrep-sampler-simple KerfuffleV2 2023-11-23 17:10:20 -07:00
  • 9ae88baf38 Merge remote-tracking branch 'upstream/master' into nomic-vulkan-redo Jared Van Bortel 2023-11-23 13:05:04 -05:00
  • a4bb9c5ced vulkan : sync with "migrate to dynamic graphs" Jared Van Bortel 2023-11-23 12:20:07 -05:00
  • 23f6d51f68 Merge commit '4760e7cc0b' into nomic-vulkan Jared Van Bortel 2023-11-23 12:12:38 -05:00
  • 208cd52f7d vulkan : implement YaRN RoPE scaling (#2268) Jared Van Bortel 2023-11-15 17:58:19 -05:00
  • 1829f1d7be Merge commit '4760e7cc0b~' into nomic-vulkan Jared Van Bortel 2023-11-23 11:45:46 -05:00
  • 02c3309f6d merge fixup (e16b9fa4ba) Jared Van Bortel 2023-11-14 15:54:26 -05:00
  • 9c4dfd06e8 mention skipped change Jared Van Bortel 2023-11-15 15:51:55 -05:00
  • fe26e6adff Merge commit 'e16b9fa4ba' into nomic-vulkan Jared Van Bortel 2023-11-14 13:55:30 -05:00
  • 6474fc879a vulkan : handle ggml_scale for n%8 != 0 Jared Van Bortel 2023-11-14 12:10:52 -05:00
  • 2a41ba7258 Merge commit '469c9addef' into nomic-vulkan Jared Van Bortel 2023-11-14 12:00:37 -05:00
  • a934b2cb8a vulkan : assert various kernel requirements Jared Van Bortel 2023-11-14 11:59:58 -05:00
  • f194e1b6a6 Merge commit 'fcca0a7004' into nomic-vulkan Jared Van Bortel 2023-11-23 13:12:32 -05:00
  • 39abedd1d7 vulkan : optimize workgroup sizes Jared Van Bortel 2023-11-23 17:18:48 -05:00
  • 84f7fc4553 vulkan : rope n_past is now KQ_pos, f16 rope kernel Jared Van Bortel 2023-11-23 17:18:42 -05:00
  • 71565eb0c3 vulkan : replace ggml_diag_mask_inf with ggml_add (custom -inf mask) Jared Van Bortel 2023-11-23 17:18:27 -05:00
  • 55978ce09b Fix incorrect format strings and uninitialized variables. (#4133) b1555 Haohui Mai 2023-11-23 13:56:53 -08:00
  • e34fffc77b now has a model mike dupont 2023-11-23 16:32:46 -05:00
  • 989c85f986 llama : fix doc for yarn_ext_factor unspecified value crasm 2023-11-23 15:16:11 -05:00
  • f6250b8df4 convert.py : make script executable crasm 2023-11-23 15:15:53 -05:00
  • 8a8859ced4 update mike dupont 2023-11-23 15:16:41 -05:00
  • 90d6f11f66 refl now working, not on pointers but on the types mike dupont 2023-11-23 14:08:17 -05:00
  • 0a21ad6e3f Merge branch 'master' into x0rsh1ft X0RSH1FT 2023-11-23 12:17:06 -05:00
  • 6b0a7420d0 llama : KV cache view API + better KV cache management (#4170) b1554 Georgi Gerganov 2023-11-23 19:07:56 +02:00
  • f8e9f11428 common : add -dkvc arg for enabling kv cache dumps kv-cache-opts Georgi Gerganov 2023-11-23 18:47:56 +02:00
  • 5df7d06c42 llama : allow exporting a view of the KV cache (#4180) Kerfuffle 2023-11-23 09:31:20 -07:00
  • aa21e6dbc2 Add doc comments for KV cache view functions KerfuffleV2 2023-11-23 08:06:51 -07:00
  • df647db611 bindings mike dupont 2023-11-23 09:55:19 -05:00
  • a08640c00d adding binding generator mike dupont 2023-11-23 09:55:07 -05:00
  • bc1c346ae8 Fix off by one error in dump_kv_cache_view KerfuffleV2 2023-11-23 07:44:10 -07:00
  • 7688d7204f Fix max contiguous empty cells index calculation KerfuffleV2 2023-11-23 07:21:19 -07:00
  • 22d0485a7a Track max contiguous cells value and position as well KerfuffleV2 2023-11-23 05:36:41 -07:00
  • 507c0674c1 adding the print module with type information mike dupont 2023-11-23 07:09:34 -05:00
  • d103d935c0 readme : update hot topics Georgi Gerganov 2023-11-23 13:51:22 +02:00
  • 9d5949f04b examples : fix typo in parallel example doc comment (#4181) b1552 Daniel Bevenius 2023-11-23 12:34:20 +01:00
  • cb137d8bfc Allow dumping the sequences per cell in common KerfuffleV2 2023-11-23 04:11:00 -07:00
  • 2c97ce073e Model download; training from scratch; fine-tuning; script optimization supermy 2023-11-23 19:09:51 +08:00
  • 9008906b2d examples: fix typo in parallel example doc comment Daniel Bevenius 2023-11-23 11:57:32 +01:00
  • 71fcb7e27e Allow exporting a view of the KV cache KerfuffleV2 2023-11-23 03:30:08 -07:00
  • ff8238f71d docs : add llama-star arch idea Georgi Gerganov 2023-11-23 11:35:04 +02:00
  • 7209a6ae0b Fix n_ctx issue for Baichuan & Baichuan2 13B model caiyesd 2023-11-23 10:46:58 +08:00
  • 625954985c Add arch argument for convert.py caiyesd 2023-11-23 10:30:43 +08:00
  • e1516709f2 Fix server.cpp code style according to review kir-gadjello 2023-11-22 22:35:57 -03:00
  • 1e65f66c30 fix for loop conditionals, increase result size Bailey Chittle 2023-11-22 16:35:17 -08:00
  • 28085f535e Use torch.inference_mode Galunid 2023-11-23 00:58:05 +01:00
  • c683f2c76a debugging mike dupont 2023-11-22 17:43:16 -05:00
  • b598cf84fa compiling and running mike dupont 2023-11-22 16:46:32 -05:00
  • 436253f5a4 convert : fix tensors using grad in some models Galunid 2023-11-22 21:31:30 +01:00
  • f16b338e04 Update convert-image-encoder-to-gguf.py John 2023-11-22 19:30:25 +01:00
  • ab8b6f15c5 ShareGPT4 compatibility (vision encoder only loading) John 2023-11-22 19:22:17 +01:00
  • 671f639c59 llama : zero KV cache used upon clear Georgi Gerganov 2023-11-22 19:30:48 +02:00
  • 09a1f053e7 now working mike dupont 2023-11-22 12:01:26 -05:00
  • ef4c0f572b moving to using refl-cpp for llama as well mike dupont 2023-11-22 11:40:25 -05:00
  • 9216e7beba Add the missing include statement Haohui Mai 2023-11-22 08:16:34 -08:00
  • 79cb8f0040 llama : keep track of used KV cells + better KV cache management Georgi Gerganov 2023-11-22 17:16:57 +02:00
  • 6fd690fae7 running mike dupont 2023-11-22 09:04:00 -05:00
  • 74d80a8862 Merge branch 'master' into convert_hf_vocab strutive07 2023-11-22 11:20:32 +00:00
  • 5ac1949fff change funtion name wonjun Jang 2023-11-22 19:54:04 +09:00
  • 9ad4d273e1 Improve server README.md kir-gadjello 2023-11-22 04:17:12 -03:00
  • af4d68b22d Update server README.md kir-gadjello 2023-11-22 03:55:23 -03:00