Commit graph

  • e8a3090508 fix a few missing 'static' specifiers Cebtenzzre 2023-09-15 15:07:40 -04:00
  • 4fe09dfe66 llama : add support for StarCoder model architectures (#3187) Meng Zhang 2023-09-16 03:02:13 +08:00
  • 159c597581 train : revert changes for now Cebtenzzre 2023-09-15 14:59:44 -04:00
  • 500389f063 use -Wmissing-prototypes with clang++ Cebtenzzre 2023-09-15 14:53:18 -04:00
  • 72a72854ee fix: switch to space from tab Meng Zhang 2023-09-16 02:49:37 +08:00
  • fb40e91884 fix build numbers by setting fetch-depth=0 Cebtenzzre 2023-09-15 14:37:23 -04:00
  • 80291a1d02 common : do not use GNU zero-length __VA_ARGS__ extension (#3195) Cebtenzzre 2023-09-15 14:02:01 -04:00
  • f2e331ca3b common : do not use GNU zero-length __VA_ARGS__ extension Cebtenzzre 2023-09-15 13:53:57 -04:00
  • 45d0c8089a do not use anonymous namespaces Cebtenzzre 2023-09-15 13:47:09 -04:00
  • 2e2273f4fb do not use trailing return types Cebtenzzre 2023-09-15 13:21:53 -04:00
  • e30ad7143f Apply suggestions from code review Meng Zhang 2023-09-16 01:20:53 +08:00
  • c6f1491da0 metal : fix bug in soft_max kernels (out-of-bounds access) (#3194) Georgi Gerganov 2023-09-15 20:17:24 +03:00
  • eafcc34f0a Update llama.cpp Meng Zhang 2023-09-16 01:16:48 +08:00
  • bb9931cf92 Update llama.cpp Meng Zhang 2023-09-16 01:14:55 +08:00
  • f989ba151d fix: remove max_position_embeddings, use n_train_ctx Meng Zhang 2023-09-16 00:56:19 +08:00
  • e1fa9dd24c Merge pull request #3 from TabbyML/support-starcoder-mqa Meng Zhang 2023-09-16 00:41:36 +08:00
  • 08f35c46a6 support-mqa-directly Meng Zhang 2023-09-16 00:36:47 +08:00
  • e3d87a6c36 convert : make ftype optional in simple scripts (#3185) Cebtenzzre 2023-09-15 12:29:02 -04:00
  • 8e7eca7e70 Merge a952716d35 into 8c00b7a6ff Qingyou Meng 2023-09-15 12:12:51 -04:00
  • 3e15ea9bac metal : fix bug in soft_max kernels (out-of-bounds access) Georgi Gerganov 2023-09-15 19:10:44 +03:00
  • 5ca037b9df add other starcoder models: 3B, 7B, 15B Meng Zhang 2023-09-16 00:10:37 +08:00
  • 8c00b7a6ff sync : ggml (Metal F32 support + reduce ggml-alloc size) (#3192) Georgi Gerganov 2023-09-15 19:06:03 +03:00
  • 57eaa39c16 refactor: cleanup comments a bit Meng Zhang 2023-09-16 00:05:32 +08:00
  • 63fcbbb3f1 Change label to avoid confusion - rocm hipblas users should obtain binaries from yellowrosecx fork. The rocm support in this repo requires self-compilation Concedo 2023-09-16 00:04:11 +08:00
  • 8b8eb18567 Merge branch 'master' into concedo_experimental Concedo 2023-09-15 23:51:18 +08:00
  • caa722095a Merge pull request #2 from ggerganov/support-starcoder-fix Meng Zhang 2023-09-15 23:13:14 +08:00
  • 92a4f86879 llama : make starcoder graph build more consistent with others support-starcoder-fix Georgi Gerganov 2023-09-15 17:57:10 +03:00
  • f82328ab65 metal : fix out-of-bounds access in soft_max kernels Georgi Gerganov 2023-09-15 17:56:49 +03:00
  • 7c80b34c8b Create dummyfile Ali Tariq 2023-09-15 18:17:32 +05:00
  • e004116de9 Merge branch 'master' into master Rickard Hallerbäck 2023-09-15 15:14:56 +02:00
  • 7e50d34be6 cmake : fix building shared libs for clang (rocm) on windows (#3176) Engininja2 2023-09-15 06:24:30 -06:00
  • b041a3fd23 llama-bench : fix ggml_cpu_has_metal() duplicate function Georgi Gerganov 2023-09-15 14:57:59 +03:00
  • f1358606f0 sync : ggml (Metal F32 support + reduce ggml-alloc size) Georgi Gerganov 2023-09-15 14:54:18 +03:00
  • 6c353dc7c2 cleanup useless code Meng Zhang 2023-09-15 19:00:14 +08:00
  • a1cf66ea94 working in cpu, metal buggy Meng Zhang 2023-09-15 16:56:50 +08:00
  • 6b7af55db5 updated lite Concedo 2023-09-15 16:44:13 +08:00
  • 235f7c193b flake : use pkg-config instead of pkgconfig (#3188) Evgeny Kurnevsky 2023-09-15 10:10:22 +02:00
  • a51b687657 metal : relax conditions on fast matrix multiplication kernel (#3168) Georgi Gerganov 2023-09-15 11:09:24 +03:00
  • 4c85f04bed flake: use pkg-config instead of pkgconfig Evgeny Kurnevsky 2023-09-15 11:07:48 +03:00
  • 76164fe2e6 cmake : fix llama.h location when built outside of root directory (#3179) Andrei 2023-09-15 04:07:40 -04:00
  • c2ab6fe661 ci : Cloud-V for RISC-V builds (#3160) Ali Tariq 2023-09-15 13:06:56 +05:00
  • 2d770505a8 llama : remove mtest (#3177) Roland 2023-09-15 03:28:45 -04:00
  • 101c578715 add TBD Meng Zhang 2023-09-15 15:23:50 +08:00
  • f1c3abcf71 Replaced Makefile with original one Ali Tariq 2023-09-15 11:48:03 +05:00
  • 8bc76a225d add input embeddings handling Meng Zhang 2023-09-15 14:47:04 +08:00
  • ab13d071e1 store mqa directly Meng Zhang 2023-09-15 14:18:36 +08:00
  • 4420cff654 fix vram calculation for starcoder Meng Zhang 2023-09-15 13:52:43 +08:00
  • dac31da489 fix comments Meng Zhang 2023-09-15 12:57:38 +08:00
  • 0be15e162c fix head count kv Meng Zhang 2023-09-15 12:56:20 +08:00
  • 77c7ec179c properly load all starcoder params Meng Zhang 2023-09-15 12:36:11 +08:00
  • 2683611944 set n_positions to max_positioin_embeddings Meng Zhang 2023-09-15 12:35:29 +08:00
  • a17ef39792 add max_position_embeddings Meng Zhang 2023-09-15 12:35:17 +08:00
  • 57f064d7c2 load starcoder weight Meng Zhang 2023-09-15 12:12:33 +08:00
  • 166a259f67 set head_count_kv = 1 Meng Zhang 2023-09-15 12:12:27 +08:00
  • 7298c37e7e add LLM_ARCH_STARCODER to llama.cpp Meng Zhang 2023-09-15 11:45:26 +08:00
  • 7e0a843b6a fix ffn_down name Meng Zhang 2023-09-15 11:45:18 +08:00
  • 76d32cca59 convert MQA to MHA Meng Zhang 2023-09-15 11:42:16 +08:00
  • eb7f0eba3e support convert starcoder weights to gguf Meng Zhang 2023-09-15 11:24:24 +08:00
  • 0c5d4d87b0 add placeholder of starcoder in gguf / llama.cpp Meng Zhang 2023-09-15 10:38:46 +08:00
  • c33a82ecd2 convert : make ftype optional in simple scripts Cebtenzzre 2023-09-14 21:21:20 -04:00
  • 98311c4277 llama : make quantize example up to 2.7x faster (#3115) Cebtenzzre 2023-09-14 21:09:53 -04:00
  • f727ad5fc9 llama : don't zero-init vectors in quantize -> 5.1% faster Cebtenzzre 2023-09-09 17:30:16 -04:00
  • cd27e8ab32 check C++ code with -Wmissing-declarations Cebtenzzre 2023-09-14 19:03:50 -04:00
  • 50d9b41ec0 still need this nathan-sixnines 2023-09-14 17:01:31 -04:00
  • 34b1f00bdb Merge branch 'master' of https://github.com/ggerganov/llama.cpp into Markdownish_Codeblock_fix nathan-sixnines 2023-09-14 16:58:14 -04:00
  • 9bd84dab54 Merge branch 'master' of https://github.com/ggerganov/llama.cpp into Nextira_Skin nathan-sixnines 2023-09-14 16:51:17 -04:00
  • 0359428ccc nextira css nathan-sixnines 2023-09-14 16:50:38 -04:00
  • 703ef9c125 Set the singleton to nullptr here. master-703ef9c Adam Treat 2023-09-14 16:38:28 -04:00
  • c2217ca2ed Fix llama.h location when built outside of root directory Andrei 2023-09-14 16:23:22 -04:00
  • 76804fab1d exclude some more known zero values from computations in flash_attn_f32 & flash_attn_back_f32 xaedes 2023-09-14 22:18:20 +02:00
  • 00e1c707b1 remove from common/common.h and examples/main/main.cpp Roland Burke 2023-09-14 15:59:58 -04:00
  • e7e7b11455 llama : remove experimental stuff mul-mat-pad Georgi Gerganov 2023-09-14 22:52:01 +03:00
  • b5e786f036 Remove mtest Roland Burke 2023-09-14 15:47:05 -04:00
  • e41209a95f llama_tokenize should accept strings containing NUL now goerch 2023-09-14 21:26:08 +02:00
  • feea179e9f flake : allow $out/include to already exist (#3175) jneem 2023-09-14 13:54:47 -05:00
  • 0ff4ebaa97 flake : allow $out/include to already exist Joe Neeman 2023-09-14 13:50:04 -05:00
  • c644c1e984 cmake : fix building shared libs for clang (rocm) on windows Engininja2 2023-09-14 12:21:16 -06:00
  • d88dae2980 block tiling for out-prod inspired by mul-mat xaedes 2023-09-14 18:39:46 +02:00
  • 769266a543 cmake : compile ggml-rocm with -fpic when building shared library (#3158) Andrei 2023-09-14 13:38:16 -04:00
  • cf8238e7f4 flake : include llama.h in nix output (#3159) Asbjørn Olling 2023-09-14 19:25:00 +02:00
  • 4b8560e72a make : fix clang++ detection, move some definitions to CPPFLAGS (#3155) Cebtenzzre 2023-09-14 13:22:47 -04:00
  • 83a53b753a CI: add FreeBSD & simplify CUDA windows (#3053) Alon 2023-09-14 20:21:25 +03:00
  • 5c872dbca2 falcon : use stated vocab size (#2914) akawrykow 2023-09-14 10:19:42 -07:00
  • c52991e020 Merge 8556db5b25 into 990a5e226a xvolks 2023-09-14 20:05:34 +03:00
  • 990a5e226a cmake : add relocatable Llama package (#2960) b1226 bandoti 2023-09-14 14:04:40 -03:00
  • 980ab41afb docker : add gpu image CI builds (#3103) b1225 dylan 2023-09-14 09:47:00 -07:00
  • e394084166 gguf-py : support identity operation in TensorNameMap (#3095) Kerfuffle 2023-09-14 10:32:26 -06:00
  • 4c8643dd6e feature : support Baichuan serial models (#3009) b1223 jameswu2014 2023-09-15 00:32:10 +08:00
  • 0971fee710 reshuffle original sample order instead of the previous shuffled order xaedes 2023-09-14 18:21:23 +02:00
  • 35f73049af speculative : add heuristic algorithm (#3006) b1222 Leng Yue 2023-09-14 09:14:44 -07:00
  • 3a9c1d7f5a set lora_alpha to value of lora_r if it is not set via command line xaedes 2023-09-14 17:58:31 +02:00
  • a90bf494c9 Fix ugly compiler warning goerch 2023-09-14 17:57:29 +02:00
  • a95aa21dad llama : optimize vector use in quantize -> 179% faster Cebtenzzre 2023-09-09 13:43:41 -04:00
  • 0c6496840c llama : refactor k-quant mixture logic into a function Cebtenzzre 2023-09-09 13:25:15 -04:00
  • c7c0fcbfe0 Add missing change... goerch 2023-09-14 17:30:53 +02:00
  • 4d3a64fbb2 add endpoint to fetch true max context Concedo 2023-09-14 23:27:12 +08:00
  • 20cf1a4589 use unrolled vec_mad in out_prod xaedes 2023-09-14 14:27:34 +02:00
  • 01b0105890 Simplify logic goerch 2023-09-14 17:12:14 +02:00
  • 5d528ed5be Merge branch 'master' of https://github.com/goerch/llama.cpp goerch 2023-09-14 17:08:12 +02:00
  • 64b0b7453e Fixing the last deviations from sentencepiece indicated by test-tokenizer-1 goerch 2023-09-14 17:05:04 +02:00