Commit graph

  • e91a2224e4 convert-llama-h5-to-gguf.py : n_layer --> n_block klosax 2023-08-13 00:02:44 +02:00
  • 489616e126 convert-gptneox-h5-to-gguf.py : n_layer --> n_block klosax 2023-08-13 00:02:04 +02:00
  • d2ce9cfe8d gguf.py : n_layer --> n_block klosax 2023-08-13 00:01:20 +02:00
  • 8b5f0c5067 constants.py : n_layer --> n_block klosax 2023-08-13 00:00:32 +02:00
  • 43e726d9c0 adds simple grammar parsing tests drbh 2023-08-12 17:53:57 -04:00
  • 5e58ffa1ed gptneox-main.cpp : n_layer --> n_block klosax 2023-08-12 23:50:58 +02:00
  • e606ffeaee convert-llama-h5-to-gguf.py : simplify nbytes klosax 2023-08-12 22:30:35 +02:00
  • f8218477b3 convert-gptneox-h5-to-gguf.py : simplify nbytes klosax 2023-08-12 22:29:35 +02:00
  • 4cef57c81a convert-llama-h5-to-gguf.py : no need to convert tensors twice klosax 2023-08-12 21:50:24 +02:00
  • 8f09157ec9 convert-gptneox-h5-to-gguf.py : no need to convert tensors twice klosax 2023-08-12 21:48:58 +02:00
  • 5d81a715d4 gguf.py : no need to convert tensors twice klosax 2023-08-12 21:45:45 +02:00
  • afacdfe83a install ggml-meta.metal if LLAMA_METAL Kolen Cheung 2023-07-29 21:09:10 +01:00
  • 60d540831b gguf : proper closing of file M. Yusuf Sarıgöz 2023-08-12 21:42:31 +03:00
  • f857820e5d Fix MSVC compiler error vxiiduu 2023-08-13 01:09:26 +10:00
  • 8847e95725 clean away unnecessary preprocessor conditional vxiiduu 2023-08-13 00:32:52 +10:00
  • b684583f0c Update llama-util.h vxiiduu 2023-08-13 00:23:36 +10:00
  • 202eab04d3 gguf : quantization is working M. Yusuf Sarıgöz 2023-08-12 16:39:05 +03:00
  • 1fc3d30b71 gguf : start implementing quantization (WIP) M. Yusuf Sarıgöz 2023-08-12 16:09:47 +03:00
  • fa7c39540c gguf : start implementing quantization (WIP) M. Yusuf Sarıgöz 2023-08-12 15:55:58 +03:00
  • ac27ac75ac Enhance Windows 7 compatibility. vxiiduu 2023-08-12 22:19:16 +10:00
  • 4754bc22b6 Add --cfg-negative-prompt-file option for examples KerfuffleV2 2023-08-12 05:39:02 -06:00
  • b2571af255 gguf : start implementing quantization (WIP) M. Yusuf Sarıgöz 2023-08-12 14:28:17 +03:00
  • 546aae99a4 CUDA: Fixed OpenLLaMA 3b mmq, reduced compile time JohannesGaessler 2023-08-11 19:59:33 +02:00
  • c4f02b4f74 gguf : start implementing quantization (WIP) M. Yusuf Sarıgöz 2023-08-12 12:01:17 +03:00
  • 0e1a3c7e7d gguf : start implementing quantization (WIP) M. Yusuf Sarıgöz 2023-08-12 11:32:34 +03:00
  • 1132941cb3 Fix descriptor set pre-allocation assert 0cc4m 2023-08-12 10:22:58 +02:00
  • 7ac00def7b Remove unnecessary cblas link 0cc4m 2023-08-12 10:22:38 +02:00
  • 9483288e03 Merge branch 'master' into concedo_experimental Concedo 2023-08-12 16:04:11 +08:00
  • 641561058b gfx1100 support Henri Vasserman 2023-08-12 10:51:46 +03:00
  • 4fa017a1f9 gguf : start implementing quantization (WIP) M. Yusuf Sarıgöz 2023-08-12 10:40:56 +03:00
  • 8ff0398be7 server : generate .hpp jhen 2023-08-12 13:15:41 +08:00
  • 186c496fdf Merge branch 'gguf' of https://github.com/ggerganov/llama.cpp into gguf M. Yusuf Sarıgöz 2023-08-12 07:25:10 +03:00
  • 2f52008b20 gguf : rm references to old file magics M. Yusuf Sarıgöz 2023-08-12 07:24:46 +03:00
  • 3409735cff server : skip byte pair in display probabilities jhen 2023-08-12 12:06:40 +08:00
  • 55a86adc24 server : implement grammar param in the UI jhen 2023-08-12 10:41:56 +08:00
  • c58ff992dc server : add grammar support in chat.mjs jhen 2023-08-12 10:15:31 +08:00
  • 5d79fbcc4d server : implement json-schema-to-grammar.mjs following the Python impl jhen 2023-08-12 10:02:30 +08:00
  • a2b16d6172 Allow for metal development in nix package William Behrens 2023-08-11 20:45:33 -05:00
  • b19edd54d5 Adding support for llama2.c models (#2559) master-b19edd5 byte-6174 2023-08-11 19:17:25 -04:00
  • 53dc399472 server: fixed wrong variable name in timing json (#2579) master-53dc399 Equim 2023-08-12 06:35:14 +08:00
  • e76c59d524 Update gptneox-main.cpp klosax 2023-08-11 23:09:49 +02:00
  • 2a5ac7af44 Update gguf_tensor_map.py klosax 2023-08-11 23:08:48 +02:00
  • e732423280 gguf : get rid of n_mult, read n_ff from file M. Yusuf Sarıgöz 2023-08-11 23:50:38 +03:00
  • fc60a27642 ci: add linux binaries to release build ci_cublas_linux-fc60a27 Green Sky 2023-05-05 00:01:30 +02:00
  • f44bbd3d88 gguf : rm redundant method M. Yusuf Sarıgöz 2023-08-11 21:00:51 +03:00
  • 7009cf581c gguf : shorter name for member variable M. Yusuf Sarıgöz 2023-08-11 20:43:02 +03:00
  • 61919c1a8f gguf : rm references to old file formats M. Yusuf Sarıgöz 2023-08-11 20:36:11 +03:00
  • d09fd10713 gguf : write metadata in gguf_file_saver M. Yusuf Sarıgöz 2023-08-11 20:07:43 +03:00
  • 781b9ec3f5 gguf : write metadata in gguf_file_saver (WIP) M. Yusuf Sarıgöz 2023-08-11 18:01:26 +03:00
  • 28abfc90fa gguf : write metadata in gguf_file_saver (WIP) M. Yusuf Sarıgöz 2023-08-11 13:27:58 +03:00
  • e3a4960953 gguf : add gguf_get_kv_type M. Yusuf Sarıgöz 2023-08-11 13:03:23 +03:00
  • eb8ca6996f gguf : add gguf_get_kv_type M. Yusuf Sarıgöz 2023-08-11 12:24:08 +03:00
  • b2440f1943 gguf : start implementing gguf_file_saver (WIP) M. Yusuf Sarıgöz 2023-08-11 11:29:50 +03:00
  • a356b0e228 gguf : start implementing gguf_file_saver (WIP) M. Yusuf Sarıgöz 2023-08-11 10:50:02 +03:00
  • 4e58a05249 Allow overriding CC_TURING Henri Vasserman 2023-08-11 10:16:02 +03:00
  • b815e97c3d Merge 'origin/master' into hipblas Henri Vasserman 2023-08-11 10:00:07 +03:00
  • dae9dffa6a rename koboldcpp.dll to koboldcpp_default.dll Concedo 2023-08-11 14:54:27 +08:00
  • e7d346c37c gguf : start implementing gguf_file_saver (WIP) M. Yusuf Sarıgöz 2023-08-11 09:52:01 +03:00
  • c299c4ac0d New __dp4a assembly Engininja2 2023-08-11 09:43:14 +03:00
  • e6b6ae55f4 Undo mess Henri Vasserman 2023-08-11 09:30:28 +03:00
  • a07f603a3e Replace vk::QueueFamilyIgnored with VK_QUEUE_FAMILY_IGNORED to support more Vulkan header versions 0cc4m 2023-08-11 05:29:26 +02:00
  • 23e0eba66b git wasn't needed and didn't do anything William Behrens 2023-08-10 21:46:42 -05:00
  • b19bf60881 Merge branch 'ggerganov:master' into master William Behrens 2023-08-10 21:43:23 -05:00
  • 084ee1b21a copy build info to output William Behrens 2023-08-10 21:43:17 -05:00
  • 582ba1b478 metal : return null if load pipeline failed jhen 2023-08-11 07:22:47 +08:00
  • 400dcced7e Merge branch 'ggerganov:master' into master Eve 2023-08-10 17:42:13 -04:00
  • 9ca4abed89 Handle ENABLE_VIRTUAL_TERMINAL_PROCESSING more gracefully on earlier versions of Windows. master-9ca4abe DannyDaemonic 2023-08-10 13:11:36 -07:00
  • f316b94c7c gguf : rm deprecated function M. Yusuf Sarıgöz 2023-08-10 20:20:22 +03:00
  • cfb8e35b73 gguf : inference with 7B model working (WIP) M. Yusuf Sarıgöz 2023-08-10 19:56:56 +03:00
  • 52801c055d Merge pull request #1 from jrudolph/convert-llama2-vocab byte-6174 2023-08-10 12:33:02 -04:00
  • 212500e454 remove redundant entry Equim 2023-08-11 00:26:18 +08:00
  • f7de84bb8c server: fixed wrong variable name in timing json Equim 2023-08-10 23:59:00 +08:00
  • 42cc04d11d gguf : calculate n_mult M. Yusuf Sarıgöz 2023-08-10 18:49:08 +03:00
  • 22de6c5c4c upd .gitignore M. Yusuf Sarıgöz 2023-08-10 18:09:49 +03:00
  • 4c0f64e302 rm binary committed by mistake M. Yusuf Sarıgöz 2023-08-10 18:07:41 +03:00
  • 4f865181aa gguf : start implementing libllama in GGUF (WIP) M. Yusuf Sarıgöz 2023-08-10 17:49:31 +03:00
  • aa26201291 also support loading from llama2.c vocabulary Johannes Rudolph 2023-08-10 16:32:44 +02:00
  • e59fcb2bc1 Add --n-predict -2 for stopping generation on full context (#2565) master-e59fcb2 Christian Demsar 2023-08-10 10:28:27 -04:00
  • d2b95e7e70 refactor vocab loading into its own method Johannes Rudolph 2023-08-10 16:17:26 +02:00
  • 886f4eed79 updated lite, up ver, remove bell Concedo 2023-08-10 22:01:33 +08:00
  • aab15de466 commandline argument changes for clarity. Aniket 2023-08-10 09:53:21 -04:00
  • 1c4d8bf981 gguf : start implementing libllama in GGUF (WIP) M. Yusuf Sarıgöz 2023-08-10 16:52:08 +03:00
  • db5d7ab3f7 Adding more information in the README to use conversion tool. Aniket 2023-08-10 09:49:14 -04:00
  • 1638757767 Fix grammar-based sampling issue in server (#2566) master-1638757 Martin Krasser 2023-08-10 12:16:38 +02:00
  • 42e055d9d6 ws fix Henri Vasserman 2023-08-10 12:14:40 +03:00
  • f41920e3a9 AMD assembly optimized __dp4a Engininja2 2023-08-10 12:11:27 +03:00
  • 29a59b5f07 Fix merge Henri Vasserman 2023-08-10 12:09:28 +03:00
  • c5f5209d37 globalize args Concedo 2023-08-10 16:30:02 +08:00
  • 2c8e92044e Merge remote-tracking branch 'elsagranger/master' Laura 2023-08-10 07:59:55 +02:00
  • 996072c250 metal : return null instead of exit(1) jhen 2023-08-10 08:45:50 +08:00
  • 01f45e1c87 manual merge with llama.cpp master netrunnereve 2023-08-09 19:54:42 -04:00
  • 8f8ab6c4c0 hipLDFLAG Path change Unix to multisystem in Makefile YellowRoseCx 2023-08-09 18:05:03 -05:00
  • acea8e10a3 examples/main: Add --prompt-cache-clobber parameter crasm 2023-08-07 21:12:43 -04:00
  • 610ba4cfc4 Merge 'origin/master' into hipblas Henri Vasserman 2023-08-09 23:54:58 +03:00
  • 916a9acdd0 ggml-alloc: Don't try to re-use buffers of external tensors (#2562) master-916a9ac Sam Spilsbury 2023-08-09 23:47:42 +03:00
  • ea04a4ca19 add log_callback to llama_context_params for custom logging. (#2234) master-ea04a4c grahameth 2023-08-09 22:46:40 +02:00
  • b810424edf ggml-alloc: >= when checking for out-of-bounds Sam Spilsbury 2023-08-09 23:33:33 +03:00
  • 198f162065 add missing git dependency to flake.nix William Behrens 2023-08-09 11:53:20 -05:00
  • 6309f7500c output build-info.h into cmake_current_binary_dir for easier packaging William Behrens 2023-08-09 11:50:04 -05:00
  • d9b5744de0 add build-info.h to flake post install William Behrens 2023-08-09 11:37:45 -05:00