Commit graph

  • e24b4a713e
    ggml-alloc : minor fix Georgi Gerganov 2023-08-28 14:05:12 +03:00
  • 93497ac66b
    ggml : sync (mem align to header + conv_transpose_2d fixes) Georgi Gerganov 2023-08-28 13:52:41 +03:00
  • feda281d85
    made method const mendax0110 2023-08-28 12:30:03 +02:00
  • 2d323b5711
    made the methods const mendax0110 2023-08-28 12:17:46 +02:00
  • cf5d918073
    Koboldcpp-ROCm Port (#399) YellowRoseCx 2023-08-28 04:05:06 -05:00
  • dd0dc366da
    llama.h : add missing struct keyword for C compat in callback type (#2847) b1100 igarnier 2023-08-28 10:19:59 +02:00
  • f55538c3cc
    metal : fix memory leak (#2762) b1099 Georgi Gerganov 2023-08-28 10:59:08 +03:00
  • ddfa865926
    ggml : assert for odd number of blocks on ARM Georgi Gerganov 2023-08-28 10:49:47 +03:00
  • fffd167069
    metal : reuse array for command buffers and encoders Georgi Gerganov 2023-08-28 10:49:27 +03:00
  • 43a8a6297b
    metal : reuse dispatch queue + autoreleasepool Georgi Gerganov 2023-08-28 09:57:36 +03:00
  • 67dd7463ce
    metal : fix more leaks Georgi Gerganov 2023-08-25 19:05:21 +03:00
  • e7c4cccef6
    metal : clean up more memory resources Georgi Gerganov 2023-08-25 09:36:45 +03:00
  • 59196290f8
    metal : fix encoders memory leak Georgi Gerganov 2023-08-24 20:59:10 +03:00
  • f8e816e3f7
    metal : fix memory leak Georgi Gerganov 2023-08-24 13:14:22 +03:00
  • 09c10e8312
    server : avoid aniprompt in probabilities of final response Jhen 2023-08-28 14:44:20 +08:00
  • ebcee207b6
    quantize : make output filename optional again (#2823) b1098 Cebtenzzre 2023-08-28 02:32:25 -04:00
  • 3e8ff47af6
    devops : added systemd units and set versioning to use date. (#2835) JohnnyB 2023-08-28 07:31:24 +01:00
  • 8445b8767f
    make : fix clean and make sure C test fails on clang Cebtenzzre 2023-08-28 02:22:40 -04:00
  • 4b00916ac7
    Merge branch 'master' into concedo_experimental Concedo 2023-08-28 14:19:05 +08:00
  • 59fa94e44b
    make : build C compliance test by default Cebtenzzre 2023-08-28 02:14:07 -04:00
  • 39ace3668c
    tests : add a C compliance test Cebtenzzre 2023-08-28 01:57:57 -04:00
  • a076a7e80b
    llama.h: add missing struct keyword for C compat in callback type Ilias Garnier 2023-08-28 06:41:53 +02:00
  • 8e00404bc3
    typo fixed JackJollimore 2023-08-28 01:33:56 -03:00
  • 0a7ef3a5f9
    crystal-clear move path destination JackJollimore 2023-08-28 00:17:47 -03:00
  • 8cb4ee0f4c
    Clarify move model directions JackJollimore 2023-08-27 22:16:45 -03:00
  • 1f83343498
    bug fix in read_tensor_by_name xaedes 2023-08-28 02:02:05 +02:00
  • 152cfaac36
    bug fix: init model when no checkpoint was loaded xaedes 2023-08-28 01:48:21 +02:00
  • 4882ff0c59
    bug fixes in load_llama_model_gguf xaedes 2023-08-28 01:47:45 +02:00
  • 76d2794e11
    bug fixes in tokenize_file xaedes 2023-08-28 01:47:31 +02:00
  • 5d94997a09
    add gguf example cmake file xaedes 2023-08-28 01:46:53 +02:00
  • ca5b344fb1
    fix memory corruption bug in gguf xaedes 2023-08-28 01:46:37 +02:00
  • 0b2c85b025
    use norm_rms_eps, and rope parameters and command line options to set them xaedes 2023-08-27 23:39:21 +02:00
  • 91a4ccaf96
    use same GGUF_GET_KEY macro as in llama.cpp xaedes 2023-08-27 23:32:49 +02:00
  • d71069c4fb
    add layer_norm_rms_eps to checkpoint convert script xaedes 2023-08-27 23:25:41 +02:00
  • ef899fbe89
    add gguf key and tensor names for optimizer and training xaedes 2023-08-27 23:21:59 +02:00
  • 495a62a142
    save opt parameter counter as uint64 xaedes 2023-08-27 23:21:08 +02:00
  • cb42324d6a
    add gguf arch and ftype xaedes 2023-08-27 23:20:18 +02:00
  • a6f3a47c39
    Merge branch 'master' into pr-train-mem-usage-improvements xaedes 2023-08-27 23:11:47 +02:00
  • 3a91c975a6
    add first draft for checkpoint conversion script xaedes 2023-08-27 22:05:36 +02:00
  • 0c494cc60e
    save & load opt->just_initialized value xaedes 2023-08-27 22:05:24 +02:00
  • e69b7b5a38
    corrected navigation JackJollimore 2023-08-27 16:55:56 -03:00
  • ee55b1e528
    dummy-proof directions JackJollimore 2023-08-27 16:28:28 -03:00
  • 83df548b86
    clean up Usage example JackJollimore 2023-08-27 16:09:13 -03:00
  • 103cfafc77
    gguf : fix strings to not be null-terminated (#2839) b1096 Georgi Gerganov 2023-08-27 21:50:22 +03:00
  • e23553c6c1
    gguf : fix gguf_add_tensor name Georgi Gerganov 2023-08-27 21:21:44 +03:00
  • cd7b3edeed
    clarified wording JackJollimore 2023-08-27 14:51:57 -03:00
  • 364d684b9a
    Improve README.md for building in Termux on Android devices JackJollimore 2023-08-27 14:44:20 -03:00
  • 34e5c9afe5
    gguf : fix strings to not be null-terminated Georgi Gerganov 2023-08-27 20:43:32 +03:00
  • 9c8b14bc47
    Improve README.md for building in Termux on Android devices. JackJollimore 2023-08-27 14:27:37 -03:00
  • 46395e6311
    Merge branch 'ggerganov:master' into systemd-units JohnnyB 2023-08-27 17:03:33 +01:00
  • c10704d01e
    llama : fix MPI threads (close #2827) b1095 Georgi Gerganov 2023-08-27 18:55:41 +03:00
  • 6206ed6b72
    Missing dependency clblast JohnnyB 2023-08-27 16:55:02 +01:00
  • bb24276c69
    quantize : fix path parsing on Windows Cebtenzzre 2023-08-27 11:48:35 -04:00
  • aeb19505bc
    Corrections and systemd units John Boero 2023-08-27 16:40:01 +01:00
  • 230d46c723
    examples : update llama2.c converter to read vocab and write models in GGUF format (#2751) b1094 Olivier Chafik 2023-08-27 15:13:31 +01:00
  • c2899b0fd1
    CUDA: fix RoPE asserts, block sizes JohannesGaessler 2023-08-27 13:59:25 +02:00
  • 463173a6c0
    llama : speedup tokenization (#2831) b1093 Kawrakow 2023-08-27 16:50:33 +03:00
  • 2fae21ea78
    llama-bench : set locale to utf8 slaren 2023-08-27 15:28:41 +02:00
  • 86e3511500
    Fixit: it was missing the piece after the last found occurence Iwan Kawrakow 2023-08-27 16:43:50 +03:00
  • eaa13a48ff
    falcon : fix CUDA inference by making K and Q contiguous (#2830) b1092 Georgi Gerganov 2023-08-27 16:40:48 +03:00
  • 5021d7bc3f
    Speedup tokenization Iwan Kawrakow 2023-08-27 16:28:40 +03:00
  • cc924c57ee
    cuda : add assert to guard from non-cont ropes Georgi Gerganov 2023-08-27 16:00:55 +03:00
  • 7c55447f7f
    falcon : fix CUDA inference by making K and Q contiguous Georgi Gerganov 2023-08-27 15:56:03 +03:00
  • da7455d046
    readme : fix headings Georgi Gerganov 2023-08-27 15:52:34 +03:00
  • 25423e9185
    scripts : helper convert script Georgi Gerganov 2023-08-27 15:24:40 +03:00
  • a6d1189fdd
    k_quants tuning for Falcon-7b (#2816) b1089 Kawrakow 2023-08-27 15:19:59 +03:00
  • c48c5bb0b0
    readme : update hot topics Georgi Gerganov 2023-08-27 14:44:35 +03:00
  • d0cee0d36d
    gguf : add 64-bit support (GGUF v2) (#2821) b1087 Georgi Gerganov 2023-08-27 14:19:54 +03:00
  • edd4c14817
    llama : more tokenizer fixes (#2810) b1086 Georgi Gerganov 2023-08-27 14:19:19 +03:00
  • 841983fe47
    common : temporary separate llama_detokenize calls for SPM and BPE Georgi Gerganov 2023-08-27 13:04:04 +03:00
  • 21df40d0c4
    fix offloading logic JohannesGaessler 2023-08-27 11:21:26 +02:00
  • 3bb0f84932
    tests : add falcon tests (py + cpp, currently do not pass Unicode) Georgi Gerganov 2023-08-27 11:26:48 +03:00
  • 061f777de0
    k_quants tuning for Falcon-7b Iwan Kawrakow 2023-08-27 11:33:19 +03:00
  • 18a131d5e3
    Make ggml-cuda.cu build with QK_K = 64 Iwan Kawrakow 2023-08-26 19:35:24 +03:00
  • 1591e2e590
    ggml : detect SSSE3 (#2825) b1085 Przemysław Pawełczyk 2023-08-27 10:10:25 +02:00
  • 958f5f7038
    add test with q8_0 (cpu only) slaren 2023-08-18 18:44:53 +02:00
  • f430e7f821
    add 7b lora test slaren 2023-08-18 18:01:00 +02:00
  • acca961dd7
    ci : decrease CPU ppl runs to 2 to avoide 20 min timeout Georgi Gerganov 2023-08-18 13:02:56 +03:00
  • 6e5297bc16
    move lora summary to the top, add lora logs slaren 2023-08-18 03:14:58 +02:00
  • 465a98886f
    ci : add lora test slaren 2023-08-18 02:54:25 +02:00
  • 7f1c434e73
    quantize : make output filename optional again Cebtenzzre 2023-08-27 01:21:27 -04:00
  • 2d7a0fbe68
    Replace make_half2 with __halves2half2 lijiahao 2023-08-27 11:14:32 +08:00
  • af31f1f00d
    Use make_half2 for better compatibility lijiahao 2023-08-27 11:06:28 +08:00
  • 9d5b4238e8
    added config to class.py Concedo 2023-08-27 10:32:01 +08:00
  • eed5d0e386
    llama : show SSSE3 in system info Przemyslaw Pawelczyk 2023-08-27 02:26:32 +02:00
  • dd8d05b918
    ggml : add ggml_cpu_has_ssse3 Przemyslaw Pawelczyk 2023-08-27 02:25:13 +02:00
  • 789c8c945a
    ci : add LoRA test to CI (#2650) slaren 2023-08-27 09:03:27 +02:00
  • c1ac54b77a
    server : add /detokenize endpoint (#2802) b1083 Bruce MacDonald 2023-08-26 16:11:45 -07:00
  • 21757ee5b6
    Added FIM token IDs. apaz-cli 2023-08-26 17:11:41 -05:00
  • c58792c768
    llama2.c converter: cleanups + take n_ff from config ochafik 2023-08-26 23:09:22 +01:00
  • 33a5517d87
    llama.cpp : print gguf version gguf-64bit klosax 2023-08-26 23:56:48 +02:00
  • dbcf470bc6
    hellaswag : move the concat space for clarity Georgi Gerganov 2023-08-27 00:44:49 +03:00
  • ab3ba64f62
    llama.cpp : fix LF token klosax 2023-08-26 23:03:01 +02:00
  • 0722e58ac2
    llama2.c: escape whitespaces w/ U+2581 in vocab converter the llama.cpp way ochafik 2023-08-26 22:43:00 +01:00
  • c767746399
    Merge branch 'master' into fix-tokenizer Georgi Gerganov 2023-08-27 00:42:05 +03:00
  • eb8b3264f6
    tests : add test-tokenizer-1.py Georgi Gerganov 2023-08-27 00:41:44 +03:00
  • 20c44711bc
    llama2.c: use defines for gguf keys ochafik 2023-08-26 21:41:53 +01:00
  • b61b170005
    gguf : fix typo Georgi Gerganov 2023-08-26 23:14:19 +03:00
  • 730d9c681e
    convert.py : advanced option (#2753) Kerfuffle 2023-08-26 14:13:36 -06:00
  • df3b81ab29
    llama2.c: update default path for vocab model + readme ochafik 2023-08-26 20:59:46 +01:00