Commit graph

  • 0040524060 ggml : move ggml_nbytes_split to ggml-cuda.cu slaren 2023-12-14 18:17:07 +01:00
  • ae8b4840bc ggml : use ggml_row_size where possible slaren 2023-12-14 17:43:42 +01:00
  • cafcd4f895 ggml : remove n_dims from ggml_tensor (#4469) b1640 slaren 2023-12-14 16:52:08 +01:00
  • a456d83bbe add fallback for m chip & fix compiler bugs (#4) Holden 2023-12-14 22:53:14 +08:00
  • 6dcdb57be0 ggml : remove n_dims from ggml_tensor slaren 2023-12-14 15:07:03 +01:00
  • c50e400163 py : add protobuf dependency (#4466) wonjun Jang 2023-12-14 21:44:49 +09:00
  • 4b1e7603e5 Add protobuf dependency wonjun Jang 2023-12-14 21:36:32 +09:00
  • 20a68a7030 ggml : add ggml_row_size() (fixes llama out of space) (#4461) b1638 LostRuins 2023-12-14 20:13:33 +08:00
  • 03f97c6efa word2vec supermy 2023-12-14 20:05:44 +08:00
  • 80e78cea12 tests : fix sizey -> sizez Georgi Gerganov 2023-12-14 14:00:11 +02:00
  • ed866f550b ggml : fix row size compute to avoid overflows Georgi Gerganov 2023-12-14 13:49:20 +02:00
  • 8a7cfaf946 ggml : add ggml_row_size(), deprecate ggml_type_sizef() Georgi Gerganov 2023-12-14 13:27:50 +02:00
  • 7798587990 Workflow Build from experimental branch Concedo 2023-12-14 19:17:19 +08:00
  • ae3d829d0c manual workflow for generating builds instead Concedo 2023-12-14 19:00:58 +08:00
  • 3425e62745 llama : Add test for model load cancellation crasm 2023-12-14 04:47:54 -05:00
  • 8a3ceced04 update: change order Trần Đức Nam 2023-12-14 16:31:19 +07:00
  • aac7f0b944 Merge branch 'master' into concedo_experimental Concedo 2023-12-14 17:24:42 +08:00
  • 9abe2e44d1 llama : Add ability to cancel model load crasm 2023-12-14 04:03:25 -05:00
  • f0de4953ae fixed length exceeding max ctx Concedo 2023-12-14 16:58:41 +08:00
  • 04bd895311 Revert "Fixes "Not enough space in the context's memory pool" encountered on certain models, which seems to be caused by some imprecision related to the automatic casting of floating point values" Concedo 2023-12-14 16:46:29 +08:00
  • 53bbd1ee43 Merge branch 'pr_fix_buf_resize_type' into concedo_experimental Concedo 2023-12-14 16:45:18 +08:00
  • 05f7db4b29 do not cast to size_t, instead just use doubles Concedo 2023-12-14 16:43:34 +08:00
  • 2ea3934ec3 update: awq support llama-7b model Trần Đức Nam 2023-12-14 15:41:41 +07:00
  • 55e87c3749 ggml : fix OpenCL broadcast requirement for ggml_mul (close #4453) b1637 Georgi Gerganov 2023-12-14 10:35:29 +02:00
  • c88fc19d59 Merge branch 'master' into concedo_experimental Concedo 2023-12-14 16:32:42 +08:00
  • 34b3dac66d Fixes "Not enough space in the context's memory pool" encountered on certain models, which seems to be caused by some imprecision related to the automatic casting of floating point values Concedo 2023-12-14 16:00:44 +08:00
  • 873637afc7 convert : support loading vocab from fast tokenizer config (#3633) wonjun Jang 2023-12-14 17:09:34 +09:00
  • 1ad8f0d80e Fixes "Not enough space in the context's memory pool" encountered on certain models, which seems to be caused by some imprecision related to the automatic casting of floating point values Concedo 2023-12-14 16:00:44 +08:00
  • 0353a18401 readme : update supported model list (#4457) BarfingLemurs 2023-12-14 02:38:49 -05:00
  • 0e31f53422 Revert "lowvram var defaults" Concedo 2023-12-14 15:14:11 +08:00
  • 146e3bb3aa Automatically generate Linux Binaries (#564) henk717 2023-12-14 07:48:22 +01:00
  • 8dd975653d removing existing yml files Concedo 2023-12-14 14:47:03 +08:00
  • ec2cf6ce7a Merge branch 'concedo' into concedo_experimental Concedo 2023-12-14 14:42:00 +08:00
  • 2393599462 Update README.md BarfingLemurs 2023-12-13 21:56:16 -05:00
  • 7ee8df3dc9 keep simplifying the change required for UMA Erik Garrison 2023-12-14 01:06:37 +01:00
  • 35e95b6266 change exception wonjun Jang 2023-12-14 08:33:10 +09:00
  • f7cb0a65ef remove script with unclear purpose Jared Van Bortel 2023-12-13 17:55:41 -05:00
  • 9af7f58b7b move kompute to a submodule Jared Van Bortel 2023-12-13 17:54:35 -05:00
  • b906e126ca kompute : fix compile warnings Jared Van Bortel 2023-12-13 17:30:38 -05:00
  • 747e1eafcf Merge commit '81bc9214a3' into nomic-vulkan Jared Van Bortel 2023-12-13 17:25:15 -05:00
  • 27631dbb6e separate shaders from kompute itself Jared Van Bortel 2023-12-13 17:22:19 -05:00
  • 3e09e127eb rename ggml-vulkan -> ggml-kompute Jared Van Bortel 2023-12-13 17:10:32 -05:00
  • 56430c3209 relicense Vulkan backend as MIT Jared Van Bortel 2023-12-13 16:54:06 -05:00
  • e2286a2025 readme: add -DAMDGPU_TARGETS to linux cmake invocation person4268 2023-12-13 17:30:04 -05:00
  • b5ff3e45ee Merge 09279c86ce into 948ff137ec Richard Kiss 2023-12-13 20:36:28 +00:00
  • 948ff137ec server : fix handling of characters that span multiple tokens when streaming (#4446) b1634 shibe2 2023-12-13 23:57:15 +04:00
  • 4d98d9a656 sync : ggml (SD ops, tests, kernels) (#4444) b1633 Georgi Gerganov 2023-12-13 21:54:54 +02:00
  • bc2cc6e69a cuda : fix bin bcast when src1 and dst have different types slaren 2023-12-13 19:26:57 +01:00
  • 6caf33cfc8 Merge branch 'master' of https://github.com/ggerganov/llama.cpp into rocm-amd-uma Erik Garrison 2023-12-13 19:22:50 +01:00
  • 405fc540d5 avoid using deprecated ROCm hipMallocHost Erik Garrison 2023-12-13 19:18:07 +01:00
  • e754a83a40 clarify build process for ROCm on linux with cmake Erik Garrison 2023-12-13 19:12:37 +01:00
  • c3b1c12fdd make mypy happy Jared Van Bortel 2023-12-13 13:03:57 -05:00
  • 8fabb0132c code style cleanup Jared Van Bortel 2023-12-13 13:03:24 -05:00
  • 0ec380a4f5 Update stb_image.h Ikko Eltociear Ashimine 2023-12-14 02:59:39 +09:00
  • bd482a8b92 Merge branch 'ggerganov:master' into common_json MaggotHATE 2023-12-13 22:57:15 +05:00
  • d59c0b3a56 AMD ROCm: handle UMA memory VRAM expansions Erik Garrison 2023-12-13 18:53:35 +01:00
  • 1609a94ae5 metal : try to fix moe test by reducing expert size Georgi Gerganov 2023-12-13 19:19:39 +02:00
  • 70f806b821 build : detect host compiler and cuda compiler separately (#4414) b1632 Jared Van Bortel 2023-12-13 12:10:10 -05:00
  • c8554b80be Merge branch 'master' of https://github.com/ggerganov/llama.cpp into ceb/fix-cuda-warning-flags ceb/fix-cuda-warning-flags Jared Van Bortel 2023-12-13 12:06:01 -05:00
  • d870a9fd2c get_flags.mk -> get-flags.mk Jared Van Bortel 2023-12-13 12:05:01 -05:00
  • a13ba03849 cuda : restore correct im2col Georgi Gerganov 2023-12-13 18:40:44 +02:00
  • 694449f0e8 metal : fix accuracy of dequantization kernels Georgi Gerganov 2023-12-13 18:32:52 +02:00
  • bc014485be cuda : restore im2col Georgi Gerganov 2023-12-13 18:12:38 +02:00
  • 306754791f Koboldcpp.sh Fix & Nocuda (#562) henk717 2023-12-13 17:06:58 +01:00
  • 936af26db1 sync : ggml (SD ops, tests, kernels) Georgi Gerganov 2023-12-13 17:51:17 +02:00
  • cabfa0a61a server: Fix handling of characters that span multiple tokens when streaming shibe2 2023-11-20 14:57:26 +04:00
  • 2810151b98 update docs Concedo 2023-12-13 22:48:29 +08:00
  • e4bf96329e Add API key authentication for enhanced server-client security ShadowBeast 2023-12-13 16:04:51 +02:00
  • d6a9242df3 adds test file for gpt2 EC2 Default User 2023-12-13 13:54:49 +00:00
  • e447af669c Merge branch 'master' into concedo_experimental Concedo 2023-12-13 21:09:47 +08:00
  • 9fb13f9584 common : add --version option to show build info in CLI (#4433) b1631 Siwen Yu 2023-12-13 20:50:14 +08:00
  • 113f9942fc readme : update hot topics Georgi Gerganov 2023-12-13 14:05:38 +02:00
  • 0e05e1fec3 Merge e1241d9b46 into 799a1cb13b slaren 2023-12-13 12:04:28 +00:00
  • 799a1cb13b llama : add Mixtral support (#4406) b1629 slaren 2023-12-13 13:04:25 +01:00
  • e1241d9b46 metal : switch to execution barriers + fix one of the barriers mixtral Georgi Gerganov 2023-12-13 13:56:45 +02:00
  • 6b105fa106 ifndef guards for LLAMA_LOG macros danemadsen 2023-12-13 20:19:07 +10:00
  • 109e7aa8ac metal : limit kernels to not use more than the allowed threads Georgi Gerganov 2023-12-13 10:55:17 +02:00
  • ab558ac2b3 metal : fix soft_max kernels Georgi Gerganov 2023-12-13 10:54:17 +02:00
  • 2bfaeac518 common : add --version option Siwen Yu 2023-12-13 16:19:24 +08:00
  • c2c238b4f3 Merge branch 'master' into concedo_experimental Concedo 2023-12-13 14:49:03 +08:00
  • 4db9586547 do not display the "maybe" MMQ console output Concedo 2023-12-13 14:47:48 +08:00
  • 748f376746 fix: Apply phi to merged updates teleprint-me 2023-12-12 23:34:47 -05:00
  • 1aa3392685 Merge branch 'master' into phi-1 teleprint-me 2023-12-12 23:25:00 -05:00
  • 82e4f64578 convert-hf : support for mixtral-instruct (#4428) Radek Pilar 2023-12-12 20:04:10 +01:00
  • 19dcce5c35 Fix: State-setting assert fails when ctx->logits_all is false Paul Tsochantaris 2023-12-12 19:02:42 +00:00
  • facb81b83c convert : make flake8 happy Radek Pilar 2023-12-12 19:57:17 +01:00
  • 90c12e6b3c ggml : do not use BLAS with ggml_mul_mat_id Georgi Gerganov 2023-12-12 20:05:58 +02:00
  • cacac25195 cmake : fix improper joining in generator expression Jared Van Bortel 2023-12-12 11:30:57 -05:00
  • cdf3cc3c17 cmake : make CUDA warning stuff properly conditional Jared Van Bortel 2023-12-12 11:27:41 -05:00
  • e30a8ad1ee cmake : capitalize variables Jared Van Bortel 2023-12-12 11:23:04 -05:00
  • b5b2cdff1d cmake : fix incorrect variable reference Jared Van Bortel 2023-12-12 11:19:18 -05:00
  • d6f74975a4 convert : use sentencepiece tokenizer for Mixtral-instruct Radek Pilar 2023-12-12 17:05:37 +01:00
  • cf75991cac convert : typo fix, add additional hyperparameters, use LLaMA arch for Mixtral-instruct Radek Pilar 2023-12-12 17:03:44 +01:00
  • ea4402bb0e test-backend-ops : add one more sum_rows test Georgi Gerganov 2023-12-12 17:03:38 +02:00
  • a51bc0c1c0 metal : fix binary ops for ne10 % 4 != 0 Georgi Gerganov 2023-12-12 15:55:42 +02:00
  • 08eb99179a metal : add cpy f16 -> f32 kernel Georgi Gerganov 2023-12-12 14:14:15 +02:00
  • a742d9f9b7 gguf-py : bump version slaren 2023-12-12 12:46:33 +01:00
  • 6a419f4d19 convert : support safetensors format Georgi Gerganov 2023-12-12 13:04:33 +02:00
  • fecac45658 server : tweak default sampling parameters (#4367) kalomaze 2023-12-12 04:12:35 -06:00
  • 9494d7c477 english : use typos to fix comments and logs (#4354) b1627 Richard Kiss 2023-12-12 01:53:36 -08:00