Commit graph

  • ec21fa7712 Merge branch 'master' into concedo_experimental Concedo 2023-12-08 17:42:26 +08:00
  • 930cdfb1ce updated lite, added patch that links to noscript mode Concedo 2023-12-08 16:53:30 +08:00
  • ac69c93ca9 linking mike dupont 2023-12-07 18:40:52 -05:00
  • 4c32117994 Update temp max kalomaze 2023-12-07 17:05:27 -06:00
  • fc2d49df1c Merge branch 'ggerganov:master' into master WillCorticesAI 2023-12-07 17:09:00 -05:00
  • 92887c4684 clang-format the llama.* files Will Findley 2023-12-07 17:07:29 -05:00
  • 9dabc9066b Set a more typical Top P setting as the default kalomaze 2023-12-07 15:06:35 -06:00
  • 115a9218eb Split sampling into the helper function (?) kalomaze 2023-12-07 14:58:48 -06:00
  • fe680e3d10 sync : ggml (new ops, tests, backend, etc.) (#4359) b1620 Georgi Gerganov 2023-12-07 22:26:54 +02:00
  • 1e5830543e metal : print resource path Georgi Gerganov 2023-12-07 21:11:24 +02:00
  • 847d806693 metal : fix assert Georgi Gerganov 2023-12-07 20:31:37 +02:00
  • 670fb48b80 metal : fix "supports family" call Georgi Gerganov 2023-12-07 20:29:28 +02:00
  • 58265e5b09 ci : disable Metal for macOS cmake build Georgi Gerganov 2023-12-07 20:06:48 +02:00
  • d050bbe831 ggml-backend : remove backend self-registration Georgi Gerganov 2023-12-07 19:58:12 +02:00
  • ba03f9ca15 ci : try to fix macOS Georgi Gerganov 2023-12-07 19:49:01 +02:00
  • f2e8616d1f ggml : restore ggml_get_n_tasks() logic in ggml_graph_plan() Georgi Gerganov 2023-12-07 19:07:27 +02:00
  • 3d154ad283 ggml : fix bug in ggml_concat Georgi Gerganov 2023-12-07 19:06:48 +02:00
  • e14c08acc3 Update README.md Kamil Tomšík 2023-12-07 16:47:08 +01:00
  • 6f01e9ecf7 tests : add test-backend-ops Georgi Gerganov 2023-12-07 17:03:28 +02:00
  • 209d7e6522 Chose seemingly safer option MaggotHATE 2023-12-07 19:19:48 +05:00
  • ad98993d29 cuda : remove assert for rope Georgi Gerganov 2023-12-07 14:53:06 +02:00
  • 5c63815e9c Revert "cmake : enable separable compilation for CUDA" Georgi Gerganov 2023-12-07 14:46:00 +02:00
  • 549050ad95 Reworked into separate functions and a commandline, wip MaggotHATE 2023-12-07 17:44:39 +05:00
  • 6963441b22 ggml-cuda : remove device side dequantize slaren 2023-12-07 13:42:02 +01:00
  • 130aaf836c Merge branch 'ggerganov:master' into common_json MaggotHATE 2023-12-07 17:31:26 +05:00
  • 09e35d04b1 cmake : enable separable compilation for CUDA Georgi Gerganov 2023-12-07 14:25:24 +02:00
  • 06b5c623ca cuda : restore lost changes (StableLM rope) Georgi Gerganov 2023-12-07 14:25:02 +02:00
  • c6b3d195b8 cuda : restore lost changes Georgi Gerganov 2023-12-07 14:17:39 +02:00
  • 213a4e2c32 ggml : build fixes Georgi Gerganov 2023-12-07 13:52:27 +02:00
  • 95e9d8a780 sync : ggml (part 3, Metal) Georgi Gerganov 2023-12-07 13:41:33 +02:00
  • 6b1cf54197 sync : ggml (part 2, CUDA) Georgi Gerganov 2023-12-07 13:32:54 +02:00
  • 8bad78b8e2 sync : ggml (part 1) Georgi Gerganov 2023-12-07 13:18:36 +02:00
  • bcc0eb4591 llama : per-layer KV cache + quantum K cache (#4309) b1619 Georgi Gerganov 2023-12-07 13:03:17 +02:00
  • fc5f334689 readme : add API change notice gg/per-layer-kv Georgi Gerganov 2023-12-07 12:35:02 +02:00
  • 680a99e792 Merge branch 'master' into gg/per-layer-kv Georgi Gerganov 2023-12-07 12:33:11 +02:00
  • 81bc9214a3 train : fix #4227 (double free in examples/train-text-from-scratch/train-text-from-scratch.cpp) (#4351) b1618 Hongyu Ouyang 2023-12-07 02:25:22 -08:00
  • 09279c86ce Fix typos in code. Richard Kiss 2023-12-06 21:31:10 -08:00
  • 27038a22d0 Fix #4227 (double free in examples/train-text-from-scratch/train-text-from-scratch.cpp) casavaca 2023-12-06 15:42:40 -08:00
  • 31106553db Update convert-image-encoder-to-gguf.py John 2023-12-07 00:04:08 +01:00
  • 05cd6e5036 server : recognize cache_prompt parameter in OAI API (#4347) b1617 Georgi Gerganov 2023-12-06 20:21:59 +02:00
  • 6f3fb01ffb Fixes and clearing memory MaggotHATE 2023-12-06 22:37:24 +05:00
  • ff38d7f9a5 Merge branch 'master' of github.com:ggerganov/llama.cpp Laura 2023-12-06 18:14:14 +01:00
  • c7511526a2 noscript mode is done Concedo 2023-12-07 00:52:25 +08:00
  • d6244ff813 adding missing files mike dupont 2023-12-06 10:05:12 -05:00
  • 7eb27b3443 now it is letting the llm control the output mike dupont 2023-12-06 10:03:45 -05:00
  • 7972929a3b now getting response from python mike dupont 2023-12-06 09:37:04 -05:00
  • e798e30cf2 Simplify reading MaggotHATE 2023-12-06 19:06:58 +05:00
  • 872d004b64 Simplified file reading, fixed an oversight MaggotHATE 2023-12-06 18:41:25 +05:00
  • 96b6806e1c Design and style fixes, better to not decide for users MaggotHATE 2023-12-06 17:56:28 +05:00
  • 1c861466dc working calling python mike dupont 2023-12-06 07:26:30 -05:00
  • 2f3ea04010 starting boost mike dupont 2023-12-06 07:20:19 -05:00
  • 8701b58276 Json integration into common (parameters parsing) MaggotHATE 2023-12-06 17:13:13 +05:00
  • 1a1a1c3845 llama : support quantum K cache (#4312) Georgi Gerganov 2023-12-06 13:30:20 +02:00
  • 12002d8ed6 very basic noscript mode Concedo 2023-12-06 17:51:08 +08:00
  • caa9249217 common : fix compile warning b1616 Georgi Gerganov 2023-12-06 10:41:03 +02:00
  • ef455cb1e0 server : recognize cache_prompt parameter in OAI API Georgi Gerganov 2023-12-06 10:28:31 +02:00
  • da5eaef1f3 speculative : support --color (#4343) b1615 stduhpf 2023-12-06 09:08:17 +01:00
  • b2e8a2e22f minor : add braces Georgi Gerganov 2023-12-06 10:04:49 +02:00
  • 8f665eaddc speculative: add some colors Stéphane du Hamel 2023-12-06 01:59:05 +01:00
  • 5f6e0c0dff grammar : pre-computed pieces + reserve mem + less string copies (#4330) b1614 Marcus Dunn 2023-12-05 10:55:12 -10:00
  • b629ede6b3 changed decode_utf8 to take src by ref marcus 2023-12-05 10:58:24 -08:00
  • 66a8dd35a0 Merge branch 'master' into cuda-cublas-opts Georgi Gerganov 2023-12-05 20:54:33 +02:00
  • 5aa365d88f llama : allow overriding GGUF metadata when loading model (#4092) b1613 Kerfuffle 2023-12-05 10:19:18 -07:00
  • af99c6fbfc llama : remove memory_f16 and kv_f16 flags gg/quantum-k-cache Georgi Gerganov 2023-12-05 18:18:16 +02:00
  • 4adb1d69d9 cuda : add comment Georgi Gerganov 2023-12-05 18:15:51 +02:00
  • dd86df82e6 metal : use mm kernel only for quantum KV cache Georgi Gerganov 2023-12-05 18:14:04 +02:00
  • 5ea96cc710 rebased mike dupont 2023-12-05 11:06:00 -05:00
  • 903167a777 llama-bench : support type_k/type_v slaren 2023-12-05 16:32:53 +01:00
  • b2acedeb1a cuda : add F32 -> Q4_0 and F32 -> Q4_1 copy kernels Georgi Gerganov 2023-12-05 16:47:34 +02:00
  • e8457c90a0 cuda : wip Georgi Gerganov 2023-12-05 16:29:52 +02:00
  • 7bbe60576a Update new GET_KEY call KerfuffleV2 2023-12-05 07:14:49 -07:00
  • 6b58ae9892 metal : add F32 -> Q4_1 copy kernel Georgi Gerganov 2023-12-05 16:09:16 +02:00
  • 2b6ff2ec54 rebased and trimmed down mike dupont 2023-11-21 11:25:37 -05:00
  • a59bf9395f Merge branch 'master' into feat-override-metadata KerfuffleV2 2023-12-05 07:06:53 -07:00
  • 9d69ecc0c9 metal : add F32 -> Q4_0 copy kernel Georgi Gerganov 2023-12-05 16:01:50 +02:00
  • 7864a2cd9b llama : fix build Georgi Gerganov 2023-12-05 15:43:25 +02:00
  • 3ce30e07c9 llama : pass KV cache type through API Georgi Gerganov 2023-12-05 15:40:23 +02:00
  • b6f952fd8d improved exit logic Concedo 2023-12-05 21:08:10 +08:00
  • ddd9971236 Rename api-like-OAI.sh to chat-OAI.sh Yazan Agha-Schrader 2023-12-05 13:09:18 +01:00
  • 311f3b16dc Update examples/server/api-like-OAI.sh Yazan Agha-Schrader 2023-12-05 13:06:29 +01:00
  • 52c8bc3cf3 sampling : custom samplers order (#4285) b1612 MaggotHATE 2023-12-05 15:05:51 +05:00
  • 0b87ef4fae Fixing whitespaces MaggotHATE 2023-12-05 14:59:54 +05:00
  • a6c3278845 Formatting fixes MaggotHATE 2023-12-05 14:14:39 +05:00
  • c879b6d183 Merge branch 'ggerganov:master' into master MaggotHATE 2023-12-05 14:11:42 +05:00
  • 14e0ba1daa llama : rearrange model params Georgi Gerganov 2023-12-05 09:40:57 +02:00
  • 953c594d6d Merge 23987729aa into e4b76bbe31 kalomaze 2023-12-05 12:34:56 +05:00
  • e4b76bbe31 swift : revert compiler checks for swift package (#4332) b1611 kchro3 2023-12-04 23:29:46 -08:00
  • 87fda0dd86 debugging Yazan Agha-Schrader 2023-12-05 07:34:24 +01:00
  • b564660042 Merge remote-tracking branch Yazan Agha-Schrader 2023-12-05 07:34:00 +01:00
  • e348898012 Revert compiler checks for swift package kchro3 2023-12-04 17:10:37 -08:00
  • 71596272c3 Revert "remove candidates_decoded" marcus 2023-12-04 14:29:59 -08:00
  • 3773328080 remove candidates_decoded marcus 2023-12-04 14:11:45 -08:00
  • eb9d1fcd7d reserve canidates_grammar marcus 2023-12-04 14:10:11 -08:00
  • 7cd0d3232f reserve canidates_decoded marcus 2023-12-04 14:09:49 -08:00
  • 5dd1f45e1d used precomputed token text for grammar sample marcus 2023-12-04 13:30:27 -08:00
  • 911a871968 Merge branch 'ggerganov:master' into master Marcus Dunn 2023-12-04 10:55:42 -10:00
  • cae8f50b1a initial commit, going through initializations Leon Ericsson 2023-12-04 21:52:17 +01:00
  • 23b5e12eb5 simple : update error message for KV cache check (#4324) b1610 Daniel Bevenius 2023-12-04 17:04:21 +01:00
  • d208995c6d swift : fix concatenation method to avoid invalid UTF8 stringfication (#4325) b1609 Miwa / Ensan 2023-12-05 01:03:49 +09:00
  • 0a4c81d7f6 simple: update error message for KV cache check Daniel Bevenius 2023-12-04 16:32:40 +01:00