Commit graph

  • 847135aaa2 add convert script ngxson 2024-07-08 16:35:27 +02:00
  • c4dd11d1d3 readme : fix web link error [no ci] (#8347) b4b4o 2024-07-08 22:19:24 +08:00
  • afd76e6254 fix: handle default Joan Martinez 2024-07-08 15:40:27 +02:00
  • 9d5089b5bf Add ff lora matmuls Lorenzo Toniazzi 2024-07-08 14:36:27 +01:00
  • f288ae1a14 SYCL : Reenabled mmvq path for the SYCL Nvidia Backend Alberto Cabrera 2024-07-04 16:55:57 +01:00
  • 2ec846d558 sycl : fix powf call in device code (#8368) b3347 Alberto Cabrera Pérez 2024-07-08 14:22:41 +01:00
  • 50da3294c7 Merge branch 'master' into vulkan-build-integration Mason M 2024-07-08 10:13:23 -03:00
  • 0699a4ce1d Merge branch 'feat-jina-embeddings-v2-zh' of https://github.com/JoanFM/llama.cpp into feat-jina-embeddings-v2-zh Joan Martinez 2024-07-08 14:11:02 +02:00
  • 175391d4ee merge with master Joan Martinez 2024-07-08 14:05:33 +02:00
  • 3f2d538b81 scripts : fix sync for sycl Georgi Gerganov 2024-07-08 13:51:31 +03:00
  • 79e2982788 update based on review comments ngxson 2024-07-08 11:59:01 +02:00
  • 2ee44c9a18 sync : ggml b3345 Georgi Gerganov 2024-07-08 10:39:50 +03:00
  • 6847d54c4f tests : fix whitespace (#0) Georgi Gerganov 2024-07-08 10:39:36 +03:00
  • fde13b3bb9 feat: cuda implementation for ggml_conv_transpose_1d (ggml/854) John Balis 2024-07-02 11:09:52 -05:00
  • e481eb5559 renames Lorenzo Toniazzi 2024-07-08 08:41:03 +01:00
  • 322216e0a3 sync : ggml Georgi Gerganov 2024-07-08 10:39:50 +03:00
  • 4855d13555 tests : fix whitespace (#0) Georgi Gerganov 2024-07-08 10:39:36 +03:00
  • db6186aaa0 feat: cuda implementation for ggml_conv_transpose_1d (ggml/854) John Balis 2024-07-02 11:09:52 -05:00
  • 470939d483 common : preallocate sampling token data vector (#8363) b3342 Kevin Wang 2024-07-08 03:26:53 -04:00
  • d72ab18ef1 Preallocate sampling token data vector Kevin Wang 2024-07-08 06:58:45 +00:00
  • 42724b4d02 Arm AArch64: minor code refactoring Dibakar Gope 2024-07-08 04:19:04 +00:00
  • 4ff0b223c3 Arm AArch64: minor code refactoring Dibakar Gope 2024-07-06 19:15:55 +00:00
  • 110d143ece Arm AArch64: minor code refactoring Dibakar Gope 2024-07-03 12:41:13 +00:00
  • 356464454b Arm AArch64: minor code refactoring, and add reference scalar code to quantize routines for new quant types Dibakar Gope 2024-07-03 12:38:11 +00:00
  • cbbfd69f42 Arm AArch64: minimize changes in ggml_compute_forward_mul_mat Dibakar Gope 2024-06-26 07:32:53 +00:00
  • ffbfabb517 Arm AArch64: simplify logic for calling gemm and gemv functions in ggml_compute_forward_mul_mat Dibakar Gope 2024-06-23 20:22:28 +00:00
  • 7a706067b5 Arm AArch64: minor code refactoring Dibakar Gope 2024-06-19 16:15:13 +00:00
  • cce236bc47 Arm AArch64: add multithreaded quantization support for the new types: Q4_0_4_4, Q4_0_4_8, and Q4_0_8_8 Dibakar Gope 2024-06-19 06:15:28 +00:00
  • a7055b7be5 Arm AArch64: add reference scalar gemm and gemv, and avoid dynamic memory allocations during quantization for Q4_0_4_4, Q4_0_4_8, and Q4_0_8_8 Dibakar Gope 2024-06-18 08:02:37 +00:00
  • 3c1ad5fe3c Arm AArch64: remove stale LLAMA_QKK_64 from CMakeLists.txt and delete build.zig Dibakar Gope 2024-06-14 13:00:04 +00:00
  • 79b6cdfe69 Arm AArch64: minor changes to skip the pr#7433 vec_dot code for arm cpus with SVE VL not equal to 256 bits Dibakar Gope 2024-06-14 12:30:32 +00:00
  • e2c1c47fa8 Arm AArch64: minor code changes for rebase Dibakar Gope 2024-06-05 06:05:26 +00:00
  • 7ac03e5fe8 retrigger checks Dibakar Gope 2024-05-31 18:44:25 +00:00
  • 5d10c218eb Arm AArch64: minor code change for resolving a build issue with server-windows Dibakar Gope 2024-05-31 04:33:13 +00:00
  • 746b57f4c3 Arm AArch64: minor code refactoring to split the Q4_0_AARC64 type into three separate types: Q4_0_4_4, Q4_0_4_8, and Q4_0_8_8 Dibakar Gope 2024-05-21 08:56:45 +00:00
  • 6f0dbf6ab0 infill : assert prefix/suffix tokens + remove old space logic (#8351) b3341 Georgi Gerganov 2024-07-08 09:34:35 +03:00
  • ffd00797d8 common : avoid unnecessary logits fetch (#8358) b3340 Kevin Wang 2024-07-08 02:31:55 -04:00
  • a657246d62 Arm AArch64: minor code refactoring for resolving a build issue with cmake Dibakar Gope 2024-05-16 12:15:48 +00:00
  • 8ee6779147 Arm AArch64: minor code refactoring for rebase Dibakar Gope 2024-05-01 06:53:48 +00:00
  • 441ab64989 Arm AArch64: add copyright claim only to ggml-aarch64.cpp and ggml-aarch64.h files Dibakar Gope 2024-04-29 15:01:54 +00:00
  • 43e12974ed Arm AArch64: add optimized GEMV and GEMM asm kernels for q4_0_q8_0 quantization and refactor code to address llama.cpp pr#5780 suggestions Dibakar Gope 2024-04-29 05:51:07 +00:00
  • 04ce3a8b19 readme : add supported glm models (#8360) toyer 2024-07-08 13:57:19 +08:00
  • 6c8d8266b1 Arm AArch64: add optimized GEMV and GEMM asm kernels for q4_0_q8_0 quantization and refactor code to address llama.cpp pr#5780 suggestions Dibakar Gope 2024-04-25 03:57:15 +00:00
  • 81215ff43a Arm AArch64: add optimized GEMV and GEMM asm kernels for q4_0_q8_0 quantization and refactor code to address llama.cpp pr#5780 suggestions Dibakar Gope 2024-04-23 07:36:22 +00:00
  • 340ef07fca Arm AArch64: add optimized GEMV and GEMM asm kernels for q4_0_q8_0 quantization and refactor code to address llama.cpp pr#5780 suggestions Dibakar Gope 2024-04-22 08:08:17 +00:00
  • 002e36eaec Arm AArch64: optimized GEMV and GEMM kernels for q4_0_q8_0, and q8_0_q8_0 quantization Dibakar Gope 2024-02-28 17:33:41 +00:00
  • 17db6beda4 Update datautils.mjs Robert 2024-07-08 13:16:30 +08:00
  • f9d42c598b convert_hf : identify more added control tokens for SPM tokenziers Francis Couture-Harpin 2024-07-07 23:28:38 -04:00
  • a3d5b8eafd add supported glm models in readme toyer 2024-07-08 03:11:07 +00:00
  • 3d04d337ca fix lint RunningLeon 2024-07-08 10:32:57 +08:00
  • 3b7255085c Avoid unnecessary logits fetch Kevin Wang 2024-07-08 00:19:18 +00:00
  • 072d7c96c0 Update convert_hf_to_gguf.py Where data meets intelligence 2024-07-07 16:50:56 -07:00
  • 9cac9cecc7 Update convert_hf_to_gguf_update.py Where data meets intelligence 2024-07-07 16:50:24 -07:00
  • 6597a72c1d Remove files Lorenzo Toniazzi 2024-07-07 22:09:30 +01:00
  • 6e351e0425 convert_hf : identify which user-defined tokens are control tokens Francis Couture-Harpin 2024-07-07 16:59:00 -04:00
  • 56df1fcdcb llama : fix detection of control-like user-defined tokens Francis Couture-Harpin 2024-07-07 16:13:35 -04:00
  • 6b961e3d24 Merge branch 'master' into compilade/fix-mpt-pretok Francis Couture-Harpin 2024-07-07 15:33:20 -04:00
  • d5d30b20c3 llama : pre-tokenize non-special user-defined tokens first Francis Couture-Harpin 2024-07-07 15:32:42 -04:00
  • 3fd62a6b1c py : type-check all Python scripts with Pyright (#8341) compilade 2024-07-07 15:04:39 -04:00
  • 86ccd30983 ci : only show warnings and errors in python type-check compilade/pyright-tests Francis Couture-Harpin 2024-07-07 14:08:19 -04:00
  • 244811d856 fix and speed up compilaton Dmitry Wolf 2024-07-07 20:51:51 +03:00
  • ac0f33c920 Merge branch 'master' into compilade/fix-mpt-pretok Francis Couture-Harpin 2024-07-07 11:36:17 -04:00
  • 6ec70c93be tests : fix test-tokenizer-random.py Francis Couture-Harpin 2024-07-07 11:25:07 -04:00
  • 975524bc0d infill : assert prefix/suffix tokens + remove old space logic Georgi Gerganov 2024-07-07 18:13:25 +03:00
  • a8db2a9ce6 Update llama-cli documentation (#8315) Denis Spasyuk 2024-07-07 09:08:28 -06:00
  • 6f215f1f0d py : fix new type errors from master branch Francis Couture-Harpin 2024-07-07 10:59:32 -04:00
  • 4090ea5501 ci : add checks for cmake,make and ctest in ci/run.sh (#8200) Alex Tuddenham 2024-07-07 15:59:14 +01:00
  • 0caf60a79e Merge branch 'master' into compilade/pyright-tests Francis Couture-Harpin 2024-07-07 10:51:30 -04:00
  • 30faf1f3de fix auto merge ngxson 2024-07-07 16:36:50 +02:00
  • a1666aaaca Merge branch 'master' into xsn/fix_lora ngxson 2024-07-07 16:35:41 +02:00
  • 874216b9c8 remove unused members Hongrui Chen 2024-07-07 22:32:43 +08:00
  • 872aecbf30 ci : disable pip cache in type-check workflow Francis Couture-Harpin 2024-07-07 10:02:38 -04:00
  • f6d090d7de add llm_build_mm ngxson 2024-07-07 16:01:05 +02:00
  • f1948f1e10 readme : update bindings list (#8222) Andy Tai 2024-07-07 06:21:37 -07:00
  • c5009e6128 py : switch to snake_case (#8305) Georgi Gerganov 2024-07-05 07:53:33 +03:00
  • 6d6ecd3200 cli: add EOT when user hit Ctrl+C (#8296) Xuan Son Nguyen 2024-07-04 20:55:03 +02:00
  • cbfc850793 llama : add OpenELM support (#7359) Icecream95 2024-07-05 05:14:21 +12:00
  • 63c6e90eab tokenize : add --show-count (token) option (#8299) Daniel Bevenius 2024-07-04 18:38:58 +02:00
  • 498d561ab1 build: Export hf-to-gguf as snakecase ditsuke 2024-07-04 20:54:35 +05:30
  • cb46165d9e doc: Add context for why we add an explicit pytorch source ditsuke 2024-07-03 01:02:56 +05:30
  • ba8aea8457 chore: Remove rebase artifacts ditsuke 2024-07-02 15:48:13 +05:30
  • 1d1fea0b6e chore: Fixup requirements and build ditsuke 2024-07-02 15:35:43 +05:30
  • 1ee5d59f67 chore: ignore all __pychache__ ditsuke 2024-07-02 15:18:13 +05:30
  • 3aefc742fe fix: Update script paths in CI scripts ditsuke 2024-03-10 23:21:46 +05:30
  • 84f249c4e8 fix: Actually include scripts in build ditsuke 2024-02-29 01:47:15 +05:30
  • 2c753017ae build(python): Package scripts with pip-0517 compliance ditsuke 2024-02-27 12:01:02 +05:30
  • ff2ca9cfb7 Inference support for T5 and FLAN-T5 model families (#5763) fairydreaming 2024-07-04 15:46:11 +02:00
  • 3a710b6aaf tests : add _CRT_SECURE_NO_WARNINGS for WIN32 (#8231) Daniel Bevenius 2024-07-04 12:53:42 +02:00
  • ef1600090f llama : suppress unref var in Windows MSVC (#8150) Daniel Bevenius 2024-07-04 12:50:57 +02:00
  • e9d503a5d7 convert : fix gemma v1 tokenizer convert (#8248) Georgi Gerganov 2024-07-04 10:41:03 +03:00
  • ab0e5dee19 Define and optimize RDNA1 (#8085) Daniele 2024-07-03 23:02:58 +00:00
  • 80ffd6e497 ppl : fix n_seq_max for perplexity (#8277) slaren 2024-07-03 19:33:31 +02:00
  • 40a2a1b936 fix phi 3 conversion (#8262) Xuan Son Nguyen 2024-07-03 16:01:54 +02:00
  • f7cab35ef9 gguf-hash: model wide and per tensor hashing using xxhash and sha1 (#8048) b3334 Brian 2024-07-07 22:58:43 +10:00
  • 6b5c5aff25 Merge branch 'ggerganov:master' into vulkan-build-integration bandoti 2024-07-07 09:52:15 -03:00
  • 905942abdb llama : support glm3 and glm4 (#8031) b3333 toyer 2024-07-07 20:52:10 +08:00
  • b5040086d4 llama : fix n_rot default (#8348) b3332 Georgi Gerganov 2024-07-07 14:59:02 +03:00
  • 4dd707e653 gguf_hash: renaming gguf-hash.py --> gguf_hash.py brian khuu 2024-07-07 21:54:45 +10:00
  • 2af8aa39e3 Update examples/gguf-hash/gguf-hash.cpp Brian 2024-07-07 21:46:20 +10:00
  • 4e85b06de7 fix by comments toyer 2024-07-07 11:42:54 +00:00