Commit graph

  • f91f22042f cmake : fix misuse of cxx_flags Cebtenzzre 2023-09-30 21:32:22 -04:00
  • 3724ad695d gguf : eliminate MODEL_TENSOR_NAMES Cebtenzzre 2023-09-30 19:55:24 -04:00
  • df072d2d99 mpt : addendum to changeset:1be89c40 - use "req" parameter of GGUF_GET_KEY macro instead of duplicate code Jan Ploski 2023-10-01 01:48:47 +02:00
  • fd5e2268a5 gguf : fix typos Cebtenzzre 2023-09-30 19:47:06 -04:00
  • 26c253eda2 mpt : standardized all tensor names to follow GGUF spec Jan Ploski 2023-10-01 01:43:39 +02:00
  • 1be89c4002 mpt : addendum to changeset:84e30e8 - leave parameter clamp_kqv out from metadata rather than use 0.0 to indicate "no clamping" (more compliant with the current GGUF spec?) Jan Ploski 2023-10-01 01:14:07 +02:00
  • 00e8c5c5f6 mpt : quick fix to avoid "Strange model" warning when quantizing MPT models Jan Ploski 2023-10-01 00:49:13 +02:00
  • 84e30e891c mpt : protect against "clip_qkv": null in mpt-7b Jan Ploski 2023-10-01 00:32:33 +02:00
  • 9d6533baed Delete ToK2024.txt pudepiedj 2023-09-30 22:42:02 +01:00
  • 2b565916dd Support sqr and concat on metal, persimmon-8b-q4 runs correctly Phillip Kravtsov 2023-09-30 14:11:52 -07:00
  • 574a9e12cc Merge branch 'master' of github.com:ggerganov/llama.cpp into phillip-kravtsov/support-adept-persimmon-8b Phillip Kravtsov 2023-09-30 13:24:13 -07:00
  • 52759a8e31 cleanup vvhg1 2023-09-30 20:59:26 +02:00
  • 79619527b0 brought infill up to current main code vvhg1 2023-09-30 20:56:24 +02:00
  • ac0f1752ea fix missing semicolon vvhg1 2023-09-30 20:20:13 +02:00
  • 0dde56c15d Upload ToK2024 pudepiedj 2023-09-30 18:44:49 +01:00
  • 4b45d880ba updated lite Concedo 2023-10-01 01:10:30 +08:00
  • d7f06a2d17 partially revert changes Cebtenzzre 2023-09-30 12:52:15 -04:00
  • 15236e855b mpt : added an implementation based (mostly) on falcon integration, modified with deltas from ggml/examples/mpt Jan Ploski 2023-09-30 18:49:22 +02:00
  • b49792b044 CUDA: added support for ggml_clamp (see also: https://github.com/ggerganov/ggml/issues/545) Jan Ploski 2023-09-30 18:35:35 +02:00
  • fe7da9412c make : add missing blank line Cebtenzzre 2023-09-30 12:31:14 -04:00
  • a6b9535b92 Merge branch 'master' of https://github.com/ggerganov/llama.cpp into pr-3296 Cebtenzzre 2023-09-30 12:30:44 -04:00
  • f5ef5cfb18 ggml-cuda : perform cublas mat mul of quantized types as f16 (#3412) b1299 slaren 2023-09-30 18:12:57 +02:00
  • f71068fd98 Add name of external file at end pudepiedj 2023-09-30 17:07:31 +01:00
  • 62ca9dc7b6 copy to llama.cpp as subdir Bailey Chittle 2023-09-30 08:21:24 -07:00
  • 2a5c27053e Enable external file and add datestamp pudepiedj 2023-09-30 16:35:28 +01:00
  • 39ddda27f4 disable fp16 mat mul completely with multi GPU slaren 2023-09-30 17:17:38 +02:00
  • 59937e45a3 rename CC_TURING to CC_VOLTA slaren 2023-09-30 14:28:27 +02:00
  • 45aa01ed71 naming improvement vvhg1 2023-09-30 14:00:47 +02:00
  • 7f52b3f08a cleanup vvhg1 2023-09-30 13:48:57 +02:00
  • a317730483 infill in separate example (#2) vvhg1 2023-09-30 13:47:43 +02:00
  • 191de1e8a3 allow launching with kcpps files Concedo 2023-09-30 19:35:03 +08:00
  • 62832c57c4 ggml-cuda : perform cublas matrix multiplication of quantized types as fp16 slaren 2023-09-30 12:21:56 +02:00
  • 202e28a76a do not offload rope for old cublas (+1 squashed commits) Concedo 2023-09-30 17:12:09 +08:00
  • da09a02b81 Add multi-submit for command buffers 0cc4m 2023-09-30 10:32:11 +02:00
  • 5e6450161a functional merge Concedo 2023-09-30 12:31:57 +08:00
  • b84e210f0d merge new rope param nonsense Concedo 2023-09-30 11:33:30 +08:00
  • a7b0f4c499 Fix transient definitions in find pkg Mason M 2023-09-29 23:10:45 -03:00
  • 2c177a46bc revert 8283237 and only allow LLAMA_NATIVE on x86 like the Makefile netrunnereve 2023-09-29 21:34:44 -04:00
  • 8283237401 march=native doesn't work for ios/tvos, so disable for those targets. also see what happens if we use it on msvc Eve 2023-09-30 01:12:43 +00:00
  • 062561d4ad Merge https://github.com/ggerganov/llama.cpp into llama_native netrunnereve 2023-09-29 20:56:28 -04:00
  • 45916f9078 set cmake LLAMA_NATIVE=ON by default netrunnereve 2023-09-29 20:55:11 -04:00
  • 84f7cea2f9 gguf : clean up SpecialVocab Cebtenzzre 2023-09-29 18:51:38 -04:00
  • 7fa5cbf8cc gguf : accept str for path in SpecialVocab.__init__ Cebtenzzre 2023-09-29 18:51:19 -04:00
  • d93cf1eab1 Merge branch 'master' of github.com:ggerganov/llama.cpp into phillip-kravtsov/support-adept-persimmon-8b Phillip Kravtsov 2023-09-29 15:28:01 -07:00
  • ea90d2aa8c gguf : avoid copy-pasted tensor names Cebtenzzre 2023-09-29 18:44:55 -04:00
  • 52f3cae832 gguf : add BERT, MPT, and GPT-J model architectures Cebtenzzre 2023-09-25 14:09:13 -04:00
  • 65de3281ea GGUF : GPT2 Support root 2023-09-30 00:31:41 +02:00
  • f28f52c6d0 Fix norm eps bug Phillip Kravtsov 2023-09-29 15:25:25 -07:00
  • f12863ed38 metal : fix hardcoded constants in mul_vec_q_n_f32 Cebtenzzre 2023-09-29 18:20:19 -04:00
  • 8c8b5b0666 convert : fix handling of added tokens Cebtenzzre 2023-09-29 18:00:31 -04:00
  • 3db04db2b8 update conversion script to directly take adept artifacts rather than .saftensors file Phillip Kravtsov 2023-09-29 14:59:51 -07:00
  • 846c51feac convert : use bytes_to_unicode from transformers Cebtenzzre 2023-09-29 17:41:55 -04:00
  • ec0ce978ff Add offload funcs Phillip Kravtsov 2023-09-29 14:17:39 -07:00
  • 42bfa889a6 Add q6_k support 0cc4m 2023-09-29 23:12:34 +02:00
  • b6591b5cb4 Merge upstream changes, fix conflicts 0cc4m 2023-09-29 23:12:03 +02:00
  • 64beaf76ab ggml-cuda : explicitly use cmath for log2 slaren 2023-09-29 22:17:49 +02:00
  • d6d7d0f043 Fixes for more compiler warnings goerch 2023-09-29 21:24:05 +02:00
  • 3fa8c555ff Fix for compiler warning goerch 2023-09-29 20:48:57 +02:00
  • a2ddaad577 Fix PR for recent change goerch 2023-09-29 20:40:11 +02:00
  • 6a16c36bc5 Fix PR for recent change goerch 2023-09-29 20:34:42 +02:00
  • fad8a773c1 Fix PR for recent change goerch 2023-09-29 20:23:20 +02:00
  • 607e3bffdc Fix for compiler warning goerch 2023-09-29 20:20:55 +02:00
  • 9cfb7145a1 Fix PR for recent change goerch 2023-09-29 20:17:28 +02:00
  • c09330edba Fix PR for recent change goerch 2023-09-29 20:13:06 +02:00
  • 6ed3104b80 CLBlast: Add broadcast support for matrix multiplication shibe2 2023-09-29 21:33:46 +04:00
  • 16c06fe207 Merge branch 'master' into falcon-tokenizer goerch 2023-09-29 19:29:51 +02:00
  • 40e07a60f9 llama.cpp : add documentation about rope_freq_base and scale values (#3401) b1298 slaren 2023-09-29 18:42:32 +02:00
  • 777dae5dd0 Update README.md slaren 2023-09-29 18:25:58 +02:00
  • 6d80a037c3 Update README.md slaren 2023-09-29 18:22:55 +02:00
  • 1e3781cd30 add notice to hot topics slaren 2023-09-29 18:18:52 +02:00
  • bc34dd4f5b train : fix KQ_pos allocation (#3392) b1297 Georgi Gerganov 2023-09-29 19:05:18 +03:00
  • 7f89e40e52 Parse graph early to pre-record command buffers 0cc4m 2023-09-29 17:08:09 +02:00
  • 2486725682 llama.cpp : add documentation about rope_freq_base and scale values slaren 2023-09-29 16:58:31 +02:00
  • 1eb4de0f0d make sure KQ_pos is not reallocated in finetune xaedes 2023-09-29 16:28:25 +02:00
  • 2777a84be4 llama : quantize up to 31% faster on Linux and Windows with mmap (#3206) b1296 Cebtenzzre 2023-09-29 09:48:45 -04:00
  • 66382f1e98 Merge branch 'master' into HEAD Georgi Gerganov 2023-09-29 16:40:53 +03:00
  • 0a4a4a0982 readme : update hot topics + model links (#3399) BarfingLemurs 2023-09-29 08:50:35 -04:00
  • b0670db34f llama : fix session saving/loading Georgi Gerganov 2023-09-29 15:47:21 +03:00
  • cf837a1bab Update README.md BarfingLemurs 2023-09-29 08:05:00 -04:00
  • 569550df20 readme : add link to grammars app (#3388) Andrew Duffy 2023-09-29 07:15:57 -04:00
  • 93b42fc99b Merge remote-tracking branch 'origin/master' into lora-falcon slaren 2023-09-29 12:55:16 +02:00
  • 3374ff7324 solve alibi cpu error ds5t5 2023-09-29 02:48:06 -07:00
  • af19099ab1 rebase to the latest ds5t5 2023-09-29 01:13:41 -07:00
  • 70e4a997ae train : fix KQ_pos allocation Georgi Gerganov 2023-09-29 11:51:12 +03:00
  • 8b8c6d5052 resolve comments ds5t5 2023-09-25 16:03:14 -07:00
  • 42bcc5bedb add refact model ds5t5 2023-09-24 21:32:12 -07:00
  • d904aff040 trivial cleanups Phillip Kravtsov 2023-09-28 22:36:23 -07:00
  • 7473773c0b Merge branch 'master' of github.com:ggerganov/llama.cpp into phillip-kravtsov/support-adept-persimmon-8b Phillip Kravtsov 2023-09-28 22:36:14 -07:00
  • 47dcb9fcf5 remove prints from llama.cpp & fix merge Phillip Kravtsov 2023-09-28 22:21:00 -07:00
  • c71bf2c45c swift : fix build on xcode 15 (#3387) Jhen-Jie Hong 2023-09-29 13:25:13 +08:00
  • c28a6c5ba0 remove printing logic from ggml.c Phillip Kravtsov 2023-09-28 22:18:56 -07:00
  • fa92f6e827 clean up convert scripts Phillip Kravtsov 2023-09-28 22:16:59 -07:00
  • d0a7143f71 Merge branch 'master' of github.com:ggerganov/llama.cpp into phillip-kravtsov/support-adept-persimmon-8b Phillip Kravtsov 2023-09-28 22:11:28 -07:00
  • d61eed0a39 Produces correct outputs Phillip Kravtsov 2023-09-28 22:10:45 -07:00
  • 033e3bf844 prepare to merge parallel Concedo 2023-09-29 10:30:45 +08:00
  • 2c498c6a7c Update README.md Andrew Duffy 2023-09-28 22:06:16 -04:00
  • 1ccf2efddf Add link to grammars app per @ggernagov suggestion Andrew Duffy 2023-09-28 22:02:42 -04:00
  • 498e5ffeaf swift : fix build on xcode 15 Jhen 2023-09-29 09:03:34 +08:00
  • 2d218f8353 set g_state.log_callback for metal logging Meng Zhang 2023-09-28 16:32:52 -07:00
  • bc39553c90 build : enable more non-default compiler warnings (#3200) b1292 Cebtenzzre 2023-09-28 17:41:44 -04:00