Commit graph

  • 35ce2b054f typo fixes Concedo 2023-11-18 11:05:04 +08:00
  • 45ad1b97f8 max nodes 8192 Concedo 2023-11-18 11:02:35 +08:00
  • af5a6ceb12 remove multibyte_pending from server qhduan 2023-11-18 10:30:54 +08:00
  • 506aadad87 fix deepseek bug in stream mode qhduan 2023-11-18 06:12:19 +08:00
  • b4680d7cd3 update mike dupont 2023-11-17 14:57:52 -05:00
  • bbecf3f415 llama : increase max nodes (#4115) b1535 slaren 2023-11-17 20:39:11 +01:00
  • a4ceff3dfd adding support for distilbert/sbert Andrew 2023-11-17 14:04:32 -05:00
  • 9a7665c455 added support for sbert/distilbert model Andrew 2023-11-17 14:01:14 -05:00
  • 0867e20bc2 added support for sbert/distilbert model Andrew 2023-11-17 13:59:59 -05:00
  • 438387e191 added support for sbert/distilbert model Andrew 2023-11-17 13:58:06 -05:00
  • 81d305953f added support for sbert/distilbert model Andrew 2023-11-17 13:56:34 -05:00
  • 19d82c2244 added support for sbert/distilbert model Andrew 2023-11-17 13:49:27 -05:00
  • 573aefa737 llama : increase max nodes slaren 2023-11-17 17:33:49 +01:00
  • 8e9361089d build : support ppc64le build for make and CMake (#3963) b1534 Roger Meier 2023-11-17 17:11:23 +01:00
  • 5ad387e994 tokenize : fix trailing whitespace b1533 Georgi Gerganov 2023-11-17 18:01:38 +02:00
  • 0389ed9ccb build: keep __POWER9_VECTOR__ ifdef and extend with __powerpc64__ Roger Meier 2023-11-17 16:46:33 +01:00
  • 2fa02b4b3d examples : add tokenize (#4039) b1532 zakkor 2023-11-17 17:36:44 +02:00
  • 2ab0707acb convert : use 'model' value if it exists. This allows karpathy/tinyllamas to load (#4089) Don Mahurin 2023-11-17 07:32:34 -08:00
  • 11173c92d6 py : Falcon HF compatibility (#4104) John 2023-11-17 16:24:30 +01:00
  • 9e87ef60e1 common : improve yaml log escaping (#4080) b1529 Jannis Schönleber 2023-11-17 16:24:07 +01:00
  • c7cce1246e llava : fix compilation warning that fread return value is not used (#4069) b1528 Huawei Lin 2023-11-17 10:22:56 -05:00
  • f7d5e97542 py : remove superfluous import statements (#4076) Jiří Podivín 2023-11-17 16:20:53 +01:00
  • ba4cf5c0bf train : move number of gpu layers argument parsing to common/train.cpp (#4074) b1526 Jiří Podivín 2023-11-17 16:19:16 +01:00
  • e85bb1a8e7 llama : add functions to get the model's metadata (#4013) b1525 slaren 2023-11-17 16:17:37 +01:00
  • 3e916a07ac finetune : speed-up ggml_compute_forward_out_prod_f32 via BLAS (#4079) b1524 gwjr 2023-11-17 14:48:19 +00:00
  • dd89015c13 simplify std::string specialization cebtenzzre 2023-11-17 07:57:11 -05:00
  • 947f64f163 finetune : zero the loraB initial vectors (#4082) b1523 Andrew Godfrey 2023-11-17 02:23:11 -08:00
  • b83e149ec6 cuda : get_row_rounding F32 (#4095) b1522 Andrew Godfrey 2023-11-17 00:01:15 -08:00
  • 4f447a4833 llama : fix data units (#4101) b1521 Georgi Gerganov 2023-11-17 10:00:15 +02:00
  • 16868e208e Merge branch 'master' into feat-seqrep-sampler-simple KerfuffleV2 2023-11-16 19:43:46 -07:00
  • fdcd96868a Update llama.cpp John 2023-11-17 03:42:08 +01:00
  • 1d6b201883 Update common.cpp John 2023-11-17 03:41:15 +01:00
  • c301973469 Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040) Kerfuffle 2023-11-16 19:14:37 -07:00
  • b94b982042 gguf : fix potential infinite loops while parsing (#4100) texmex76 2023-11-16 16:01:48 +01:00
  • 4fc5f7df7c llama : restore prefix space in llama tokenizer (#4081) Jared Van Bortel 2023-11-15 11:34:47 -05:00
  • 208bdcd607 ggml-cuda : increase max graph size (#4084) slaren 2023-11-15 13:58:13 +01:00
  • affa88b90a Fix MacOS Sonoma model quantization (#4052) Michael Potter 2023-11-14 09:34:41 -08:00
  • 2751031e8c stablelm : StableLM support (#3586) Galunid 2023-11-14 11:17:12 +01:00
  • 3c76bd6427 convert.py: also look for plain model.safetensors (#4043) afrideva 2023-11-13 17:03:40 -08:00
  • 1844207862 Nuke obsolete GetArrayLen struct KerfuffleV2 2023-11-16 19:32:18 -07:00
  • 91f6499393 Respect tokenizer.ggml.add_bos_token value when tokenizing (#4040) b1520 Kerfuffle 2023-11-16 19:14:37 -07:00
  • 8c9f776952 Refactor... basically everything! KerfuffleV2 2023-11-16 19:08:48 -07:00
  • aa094ace8e Token to piece (#1) John 2023-11-17 02:11:00 +01:00
  • 167f9b20fc Update OpenAI API to support text in mixed-mode data varon 2023-11-17 02:09:31 +02:00
  • 333f704f74 enable logging of cuda in app mike dupont 2023-11-16 19:06:11 -05:00
  • 8e9396aadd Falcon HF compatibility John 2023-11-16 23:24:34 +01:00
  • c5a45042f5 merge mike dupont 2023-11-16 16:18:30 -05:00
  • e3ae974d3d grammars mike dupont 2023-11-16 16:16:47 -05:00
  • e96944551a Add RMS Norm shader, rework op_f32 shader setup, fix matmul bug 0cc4m 2023-11-16 22:11:57 +01:00
  • da122af024 Add openBLAS support for sgemm() in compute_forward_out_prod() gwjr 2023-11-16 18:45:33 +00:00
  • 63d85d166b llama : disambiguate data units Georgi Gerganov 2023-11-16 17:48:49 +02:00
  • 82d97499b6 Revert "llama : fix data units" Georgi Gerganov 2023-11-16 17:46:33 +02:00
  • f5feac831f llama : fix data units Georgi Gerganov 2023-11-16 17:19:35 +02:00
  • 8da46278e1 gguf : fix potential infinite loops while parsing (#4100) b1519 texmex76 2023-11-16 16:01:48 +01:00
  • 765ec1c077 gguf: fix potential infinite loops while parsing Bernhard Gstrein 2023-11-16 13:41:53 +01:00
  • 4494a9f655 fix Bingxuan Wang 2023-11-16 17:02:22 +08:00
  • 69be5c3d6d Step1 KerfuffleV2 2023-11-16 01:09:41 -07:00
  • 9cecb7a613 remove unused code Bingxuan Wang 2023-11-16 15:50:34 +08:00
  • 3f4185b654 update and refactor Bingxuan Wang 2023-11-16 13:58:59 +08:00
  • 7d971ee3d9 Merge branch 'master' into regex_gpt2_preprocess Bingxuan Wang 2023-11-16 11:48:36 +08:00
  • a55a095119 Update ggml-cuda.cu Andrew Godfrey 2023-11-15 18:58:12 -08:00
  • bf46304cbc Update ggml-cuda.cu Andrew Godfrey 2023-11-15 18:58:05 -08:00
  • e77094faab Fix #4017 Andrew Godfrey 2023-11-15 18:17:11 -08:00
  • e5c1f02645 Remove ggml_compute_forward_out_prod_use_blas(), fix compiling errors on cmake/zig, remove trailing whitespace gwjr 2023-11-15 23:00:44 +00:00
  • f824902623 YaRN : correction to GPT-NeoX implementation ceb/fix-yarn-neox Jared Van Bortel 2023-11-15 17:07:57 -05:00
  • 9d39deab8f Fix the one time GCC is stricter than clang about something KerfuffleV2 2023-11-15 12:14:19 -07:00
  • ba839d1dd0 feat: Allow overriding GGUF metadata when loading model KerfuffleV2 2023-11-15 11:51:27 -07:00
  • a3f708afce added more fields to the openai compatible completions APIs Concedo 2023-11-16 00:58:08 +08:00
  • a6fc554e26 llama : restore prefix space in llama tokenizer (#4081) b1518 Jared Van Bortel 2023-11-15 11:34:47 -05:00
  • 4571bcc17f Use ggml_set_zero instead of adding a new function Andrew Godfrey 2023-11-15 08:05:40 -08:00
  • 2fa3e412ce use 'model' value if it exists. This allows karpathy/tinyllamas to load. Don Mahurin 2023-11-15 06:06:08 -08:00
  • 1cf2850d52 ggml-cuda : increase max graph size (#4084) b1517 slaren 2023-11-15 13:58:13 +01:00
  • 914e375602 support custom dalle urls Concedo 2023-11-15 18:37:50 +08:00
  • 35a97e14b2 Merge branch 'master' into concedo_experimental Concedo 2023-11-15 16:59:53 +08:00
  • b4a36f4083 ggml-cuda : increase max graph size slaren 2023-11-15 09:25:24 +01:00
  • cc1f3fcfad Fix typo in convert.py wonjun Jang 2023-11-15 17:22:59 +09:00
  • 8b919b5b57 allow customized rope to use model set values Concedo 2023-11-15 16:21:52 +08:00
  • fdbef3b4cc logging: include review feedback Jannis Schönleber 2023-11-15 08:54:15 +01:00
  • 4e67165d92 merge changes Bingxuan Wang 2023-11-15 12:16:07 +08:00
  • c72c1b37de tabs to spaces Andrew Godfrey 2023-11-14 16:57:28 -08:00
  • 91eb33585b finetune : zero the loraB initial vectors Andrew Godfrey 2023-11-12 18:23:06 -08:00
  • 5e899428cb do not add space prefix if the first token is special Jared Van Bortel 2023-11-14 19:21:14 -05:00
  • 6b732a0bf8 Merge branch 'master' of github.com:ggerganov/llama.cpp Laura 2023-11-15 00:15:08 +01:00
  • 735ffe3d2f Revert "dont add space when using special tokens" Jared Van Bortel 2023-11-14 16:48:33 -05:00
  • 6ee4682271 logging: improve escaping in yaml output Jannis Schönleber 2023-11-14 19:08:54 +01:00
  • 6bb4908a17 Fix MacOS Sonoma model quantization (#4052) b1516 Michael Potter 2023-11-14 09:34:41 -08:00
  • 2f0c5dcaf5 Use cblas_sgemm() to implement ggml_compute_forward_out_prod() gwjr 2023-11-14 15:20:32 +00:00
  • d75eae6333 Remove logically superfluous assertions and order by dimension gwjr 2023-11-14 15:08:09 +00:00
  • 868e0457e9 Removed superfluous import statements Jiri Podivin 2023-11-14 16:10:24 +01:00
  • 37d230c3f2 Moving number of gpu layers argument parsing to common/train.cpp Jiri Podivin 2023-11-14 15:57:48 +01:00
  • 36eed0c42c stablelm : StableLM support (#3586) b1515 Galunid 2023-11-14 11:17:12 +01:00
  • 030886e48a upload model and add git ignore Bingxuan Wang 2023-11-14 17:44:37 +08:00
  • 9fa2627cf5 Merge pull request #1 from DOGEwbx/master Bingxuan Wang 2023-11-14 17:37:25 +08:00
  • c31263e0cb fix falcon preprocess and add deepseek coder Bingxuan Wang 2023-11-14 17:34:46 +08:00
  • 512d9746bf use a more conventional macro name cebtenzzre 2023-11-14 00:08:58 -05:00
  • 7962d0a789 detect linker version instead of compiler version cebtenzzre 2023-11-14 00:04:01 -05:00
  • e8a001467e Fix compilation warning that fread return value is not used Huawei Lin 2023-11-13 22:50:07 -05:00
  • a169862c51 gguf-py: Try to fix SpecialVocab giving up too easily for the Nth time KerfuffleV2 2023-11-13 18:38:17 -07:00
  • e7552a4d78 escaping ebnf mike dupont 2023-11-13 20:23:00 -05:00
  • ffa494eb14 tabs to spaces Andrew Godfrey 2023-11-13 17:14:16 -08:00