Commit graph

  • 20b8ff5064 Add headers to nix packages William Behrens 2023-08-09 11:10:35 -05:00
  • fd026f419d Handle ENABLE_VIRTUAL_TERMINAL_PROCESSING more gracefully on earlier versions of Windows. Danny Daemonic 2023-08-09 08:12:14 -07:00
  • 57782c0bcb CUDA: Removed obsolete cmake CUDA arch JohannesGaessler 2023-08-09 17:07:09 +02:00
  • a07e6dd3ad revert cuda changes as they are bugggy Concedo 2023-08-09 22:36:41 +08:00
  • f8376c7e61 up ver, fixed compile (+1 squashed commits) Concedo 2023-08-09 21:23:33 +08:00
  • 7715eced38 Fix grammar-based sampling issue in server Martin Krasser 2023-08-09 15:14:21 +02:00
  • ba09f1c807 Merge branch 'master' into concedo_experimental Concedo 2023-08-09 21:18:34 +08:00
  • a3fa0abaaa for got to add newline Aniket 2023-08-09 09:16:30 -04:00
  • 3a7853d259 handle stablecode-completion-alpha-3b Concedo 2023-08-09 21:07:57 +08:00
  • 40a51ec6a3 adding CMakeLists.txt file in the conversion script directory Aniket 2023-08-09 09:06:47 -04:00
  • afb8f6ee6a removing 1 whitespace Aniket 2023-08-09 09:06:10 -04:00
  • 7d0404c393 adding newline in readme Aniket 2023-08-09 09:05:37 -04:00
  • 7b1f062620 adding add_subdirectory in examples dir CMakeLists.txt Aniket 2023-08-09 09:04:24 -04:00
  • d551906b7b Merge 6383bbfa5f into 25d43e0eb5 jon-chuang 2023-08-09 17:27:15 +08:00
  • 84f7995e48 Change LTO to option and other stuff Henri Vasserman 2023-08-09 11:24:08 +03:00
  • 7674422f3e Merge remote-tracking branch 'origin/master' into zig-fixes Henri Vasserman 2023-08-09 10:52:39 +03:00
  • 25d43e0eb5 CUDA: tuned mul_mat_q kernels (#2546) master-25d43e0 Johannes Gäßler 2023-08-09 09:42:34 +02:00
  • 90058d96b0 sleep longer before exit Concedo 2023-08-09 15:28:07 +08:00
  • 2d71bf95cb Add --n-predict -2 for stopping generation on full context crasm 2023-08-09 02:17:56 -04:00
  • 487cd25086 metal : print error of load pipeline state jhen 2023-08-09 13:17:23 +08:00
  • 19cf2a8663 add idle field and up ver Concedo 2023-08-09 12:42:59 +08:00
  • 4b8a354895 cudatoolkit version Concedo 2023-08-09 12:25:21 +08:00
  • 159ad9269d up ver, set the cuda pool malloc lookahead back to 5% instead of 2% (+1 squashed commits) Concedo 2023-08-09 11:50:12 +08:00
  • 49f0bfd69d Update README.md Eve 2023-08-08 22:58:53 -04:00
  • 3919e67421 Update README.md Eve 2023-08-08 22:58:46 -04:00
  • 193f295a3a Update llama.cpp Eve 2023-08-08 22:47:34 -04:00
  • be26777a6a add pp_threads support to other files netrunnereve 2023-08-08 22:19:59 -04:00
  • d854348992 perplexity only uses pp_threads netrunnereve 2023-08-08 21:30:12 -04:00
  • 5624a29c1f Merge branch 'ggerganov:master' into master Eve 2023-08-08 21:13:28 -04:00
  • d14c066f0c cleaning up to remove spaces and satisfy failed checks Aniket 2023-08-08 20:40:17 -04:00
  • 829565b13d better SQL JohannesGaessler 2023-08-09 01:31:26 +02:00
  • 4024f91a66 Add intrinsics polyfills for AMD Henri Vasserman 2023-08-09 01:56:44 +03:00
  • 0246d0dd6f gptneox-main.cpp : map tensor names klosax 2023-08-09 00:54:21 +02:00
  • 7d5f4522dd convert-llama-h5-to-gguf.py : map tensor names klosax 2023-08-09 00:52:16 +02:00
  • f4d137d98c convert-gptneox-h5-to-gguf.py : map tensor names klosax 2023-08-09 00:50:11 +02:00
  • ece4fc185e map tensor names klosax 2023-08-09 00:48:33 +02:00
  • ab6212864c Merge 'origin/master' into hipblas Henri Vasserman 2023-08-09 00:37:01 +03:00
  • 28046d1e52 Merge and update server-cfg Henri Vasserman 2023-08-09 00:36:11 +03:00
  • 5520876c3c cleaning up Makefile empty space before mearge Aniket 2023-08-08 14:28:34 -04:00
  • 08e94332fc cleaning up some earlier files used for experiments Aniket 2023-08-08 14:27:01 -04:00
  • 088eb86fbe updating gitignore Aniket 2023-08-08 14:21:14 -04:00
  • 223ddb77b3 updating makefile so my initial tests are not compiled Aniket 2023-08-08 14:19:30 -04:00
  • 3c0c155309 Merge branch 'ggerganov:master' into master byte-6174 2023-08-08 14:14:02 -04:00
  • 9a09e6418f minor spacing update Aniket 2023-08-08 14:00:05 -04:00
  • 2a0138e5ea updating readme for instructions for compilation and use Aniket 2023-08-08 13:52:20 -04:00
  • ff9fae57d1 updating makefile so test scripts are not compiled Aniket 2023-08-08 13:45:00 -04:00
  • 97c809448f add plotting files JohannesGaessler 2023-08-08 19:34:31 +02:00
  • bb99064690 Merge branch 'fix-benchmark-matmult-constants' of https://github.com/goerch/llama.cpp into fix-benchmark-matmult-constants goerch 2023-08-08 19:13:21 +02:00
  • ea62e6eca9 Fix constants for matmul benchmark to work with Q4_0 goerch 2023-08-08 19:13:16 +02:00
  • 926d90fbab Merge branch 'master' into concedo_experimental Concedo 2023-08-09 01:09:04 +08:00
  • da53236bf3 database writing works JohannesGaessler 2023-08-08 19:05:10 +02:00
  • 793cfd136c fixed 70B detection again, try fix horde issues, fixed lite unicode issue, fixed cmake for cuda Concedo 2023-08-09 01:05:00 +08:00
  • 465cadd44c Refactor special tokens tokenization Igor Pissolati 2023-08-08 12:46:18 -03:00
  • ada6cce40f Replace trie with linear search Igor Pissolati 2023-08-08 11:43:29 -03:00
  • f424e48ee0 more zig stuff Henri Vasserman 2023-08-08 18:05:45 +03:00
  • f5bfea0580 Allow passing grammar to completion endpoint (#2532) master-f5bfea0 Martin Krasser 2023-08-08 15:29:19 +02:00
  • 961f2ab73f Fix unicode in grammars (fixes #2501) Evan Jones 2023-08-05 22:58:41 -04:00
  • acfc5478ff CUDA: tighter VRAM scratch size for 65b/70b (#2551) master-acfc547 Johannes Gäßler 2023-08-08 14:38:16 +02:00
  • 7ed8d1fe7f llm.vim : multiline autocompletion, get rid of "^@" (#2543) chaihahaha 2023-08-08 20:07:02 +08:00
  • e7f94d6fdc vim : bring back simple llm.vim example Georgi Gerganov 2023-08-08 15:05:30 +03:00
  • 5d8b765966 CUDA: tighter VRAM scratch size for 65b/70b JohannesGaessler 2023-08-08 13:55:11 +02:00
  • 2d7baaf50f vim : streaming and more (#2495) AustinMroz 2023-08-08 06:44:48 -05:00
  • 1b7f2c6dab ggml-alloc: Don't try to re-use buffers of external tensors Sam Spilsbury 2023-08-08 12:33:21 +03:00
  • 1b080994ff ggml-alloc: Don't try to re-use buffers of external tensors Sam Spilsbury 2023-08-08 12:30:12 +03:00
  • 8b73356b2d Fix trailing whitespace Martin Krasser 2023-08-08 09:44:28 +02:00
  • 4566533296 Add test vocabularies goerch 2023-08-08 08:26:52 +02:00
  • 98f5b1fa85 Merge branch 'gguf' of https://github.com/goerch/llama.cpp into gguf goerch 2023-08-08 08:22:42 +02:00
  • bb89266334 Merge tokenizer fixes into the gguf branch. goerch 2023-08-08 08:13:09 +02:00
  • f1f85de815 Split BPE and SentencePiece vocabularies goerch 2023-08-08 07:23:01 +02:00
  • 2f9181f235 Trim whitespace from first 2 displayed tokens crasm 2023-08-07 20:11:47 -04:00
  • cfdc3494e3 Always print num tokens crasm 2023-08-07 19:03:28 -04:00
  • ca32203c88 CUDA: tuned mul_mat_q kernels JohannesGaessler 2023-08-06 23:07:54 +02:00
  • 4fc3776ceb Add another test case Igor Pissolati 2023-08-07 18:30:24 -03:00
  • 236c838d23 server : fix Probabilites not used if included empty str Jhen 2023-08-08 04:40:41 +08:00
  • 6f7dabab44 Add simple test for special tokens Igor Pissolati 2023-08-07 17:31:13 -03:00
  • d9791bb48b Add C API for adding special tokens Igor Pissolati 2023-08-07 17:30:12 -03:00
  • 65559a23c8 Update gptneox-main.cpp klosax 2023-08-07 22:28:43 +02:00
  • 38fbb74038 Merge branch 'master' into fix-2023 goerch 2023-08-07 21:24:22 +02:00
  • 00e9a228dc Remove getbufoneline usage, Add input bind example. Austin Mroz 2023-08-07 12:36:17 -05:00
  • f3c3b4b167 Add --rope-scale parameter (#2544) master-f3c3b4b klosax 2023-08-07 19:07:19 +02:00
  • 3554080502 fixed blasbatchmul multiplier Concedo 2023-08-08 00:41:02 +08:00
  • 28ad80b6e4 Merge branch 'master' into concedo_experimental Concedo 2023-08-08 00:34:10 +08:00
  • 3c7d938d95 update lite, resize scratch buffers for blasbatch 2048 Concedo 2023-08-08 00:32:51 +08:00
  • e1175b8314 README.md : Add info about using linear rope scaling klosax 2023-08-07 18:27:49 +02:00
  • 9348aa4df9 Metal implementation Cebtenzzre 2023-07-21 17:10:57 -04:00
  • 6aeb46b343 CUDA implementation Cebtenzzre 2023-07-18 22:28:27 -04:00
  • 8dec38c35c llama: implement NTK-By-Parts (NTKv2) RoPE scaling Cebtenzzre 2023-07-17 20:07:15 -04:00
  • 30b63f71bd multiline autocompletion, get rid of "^@" chaihahaha 2023-08-08 00:08:49 +08:00
  • 099119f532 Fixes to rebase Igor Pissolati 2023-08-07 12:59:11 -03:00
  • 8083ae347a gguf : minor stuff Georgi Gerganov 2023-08-07 19:02:18 +03:00
  • b2417d0dfb common.cpp : Add --rope-scale parameter klosax 2023-08-07 17:57:36 +02:00
  • f6d5fe3afc Use some tricks to eliminate the necessity for a new format Igor Pissolati 2023-06-22 11:29:51 -03:00
  • 1da82c551f Merge branch 'master' into gguf Georgi Gerganov 2023-08-07 18:53:03 +03:00
  • 41a2ed03e7 Ignore unusable json values Igor Pissolati 2023-06-20 19:20:53 -03:00
  • ca1fc20508 Fix issues revealed by CI Igor Pissolati 2023-06-20 01:27:36 -03:00
  • e468e75515 Remove trailing whitespaces Igor Pissolati 2023-06-19 23:03:58 -03:00
  • 7f9d720105 Better loading of special tokens from jsons Igor Pissolati 2023-06-19 16:00:13 -03:00
  • 0c14627438 Code cleanup Igor Pissolati 2023-06-19 14:52:57 -03:00
  • 61a98bc30a Improve support for special tokens Igor Pissolati 2023-06-18 20:11:01 -03:00
  • 4357e692ac gguf.py : use custom alignment if present klosax 2023-08-07 13:51:26 +02:00