Commit graph

  • 26d4efd11e
    sampling: fix top_k <= 0 (#5388) b2098 Johannes Gäßler 2024-02-08 09:46:30 +01:00
  • 7b996502e7
    Update llama.cpp Johannes Gäßler 2024-02-08 09:45:58 +01:00
  • 4e2bd8b01b
    llava: fix typo/formatting in README.md Daniel Bevenius 2024-02-08 09:42:08 +01:00
  • 61776c8870 fix error C2078: too many initializers with uint32x4_t for MSVC ARM64 Michael Podvitskiy 2024-02-08 09:39:32 +01:00
  • 0495811c9c
    Merge branch 'ggerganov:master' into master Michael Podvitskiy 2024-02-08 09:32:39 +01:00
  • 8504d2d0da
    tests : .gitignore obj files Georgi Gerganov 2024-02-08 09:46:47 +02:00
  • 0051c82d52 it runs; tokenization is messed up; pooling is wrong for multi batches Douglas Hanley 2024-02-08 01:02:10 -06:00
  • 37a147ebf9
    Clip: Bugfix for normalization (it did not load the 3 std and mean values); Clip: bicubic resize function; Clip: added save-to-bmp/pil for debugging and conversion from/to 32/8 images; Clip: added normalization with FP16 precision simulation (image tensors match HF implementation, can be switched off, only used for llava-1.6); Clip: added newline tensor, mergetype kv, image-grid kv, new resize-pad function with resolution from gridpoints; Clip: clip_image_preprocess now returns a float * vector instead of float, this way llava 1.5 and 1.6 are supported; llava: added ggml cpu graph for embedding patching, added spatial_unpad preliminary support, added a lot of comments that need to be cleaned when all is final; convert-image-encoder: fixed image-grid flattening John 2024-02-08 07:42:49 +01:00
  • b639e2a73f
    Bugfix printf tensor John 2024-02-08 03:00:00 +01:00
  • 2c1e8d3205
    Merge branch 'ggerganov:master' into cmp-nct-ggllm-tensor_printf_patch John 2024-02-08 02:54:02 +01:00
  • f156112f56
    Merge branch 'ggerganov:master' into master bmwl 2024-02-07 17:38:53 -08:00
  • 783b7ca02d Removing unneeded branch in server.cpp example and moving get_numa_affinity and making it static root 2024-02-07 22:28:29 +00:00
  • d47f232fc1 Removing last bit of MIRROR_MODE code for this PR root 2024-02-07 22:02:21 +00:00
  • 61c37ba93c Removing MIRROR_MODE code for this PR root 2024-02-07 21:46:19 +00:00
  • c4fbb6717c
    CMAKE_OSX_ARCHITECTURES for MacOS cross compilation (#5393) b2096 Michael Podvitskiy 2024-02-07 22:39:23 +01:00
  • 3eccea1b63 Syncing to pr root 2024-02-07 21:36:39 +00:00
  • 1d618a5efb arm intrinsics detection for msvc Michael Podvitskiy 2024-02-07 22:12:47 +01:00
  • 8c933b70c2
    fix typo in readme (#5399) Ebey Abraham 2024-02-07 21:11:30 +00:00
  • ef10d7867e merge from master Douglas Hanley 2024-02-07 15:01:05 -06:00
  • fc591358ec fix typo in readme Ebey Abraham 2024-02-07 20:57:15 +00:00
  • 05dd7171d7
    Merge branch 'ggerganov:master' into Val-patch-1 valiray 2024-02-07 14:39:51 -06:00
  • e0ab3c7e7c clean up CMAKE_SYSTEM_PROCESSOR checks Jared Van Bortel 2024-02-07 15:33:58 -05:00
  • 449585a498 n_considered configurable JohannesGaessler 2024-02-07 21:26:49 +01:00
  • fc57bdbde7 use STREQUAL instead of MATCHES Michael Podvitskiy 2024-02-07 21:24:30 +01:00
  • c43808c625 Fixed a number of issues with the move from BOOL to ggml_numa_strategies. Added a note about mirror mode not being implemented yet root 2024-02-07 19:49:07 +00:00
  • e390b22f57 fix, accept 2.8% JohannesGaessler 2024-02-07 20:46:06 +01:00
  • 6cecefd64c works (?), 2.28% accept JohannesGaessler 2024-02-07 20:38:28 +01:00
  • 1baba975e2 a better way to define x86 architecture Michael Podvitskiy 2024-02-07 20:15:17 +01:00
  • 1895235ae2
    a better way to define arm architecture Michael Podvitskiy 2024-02-07 20:13:01 +01:00
  • 1574279273 partial implementation JohannesGaessler 2024-02-07 20:11:12 +01:00
  • 4ce0211639 partial implementation JohannesGaessler 2024-02-07 20:04:44 +01:00
  • b906596bb7
    Add Ava in the list of llama.cpp UIs (#4362) b2094 Kamil Tomšík 2024-02-07 19:44:52 +01:00
  • 1d6059a5e2 count token combinations JohannesGaessler 2024-02-07 19:44:38 +01:00
  • 6d47013d81 read static_input_file JohannesGaessler 2024-02-07 19:19:39 +01:00
  • 34eb6a33cc
    Merge branch 'master' into patch-1 Kamil Tomšík 2024-02-07 19:10:28 +01:00
  • 51b7317a65 random choice JohannesGaessler 2024-02-07 18:55:37 +01:00
  • 617ae42dd4 initial commit JohannesGaessler 2024-02-07 18:48:37 +01:00
  • 5b0cec5ca6 update for flake8 lint vincent 2024-02-07 23:21:56 +08:00
  • d56b638385 fix: undo HF models permute tensor vincent 2024-02-07 23:04:06 +08:00
  • 526d517256 to align with the same order as convert.py for model write vincent 2024-02-07 23:03:07 +08:00
  • e4bd73c943 fix bug for norm_rms_eps missing vincent 2024-02-07 23:01:44 +08:00
  • 17f202ab7a CMAKE_OSX_ARCHITECTURES for MacOS cross compilation Michael Podvitskiy 2024-02-07 15:47:55 +01:00
  • 9a60f1b7bf improve array code hazelnutcloud 2024-02-07 21:59:51 +08:00
  • 82382ef3c2 fix error C2078: too many initializers with uint32x4_t for MSVC ARM64 Michael Podvitskiy 2024-02-07 14:20:24 +01:00
  • 2f8e6078b0 sampling: fix top_k <= 0 JohannesGaessler 2024-02-07 13:19:00 +01:00
  • aa7ab99be2
    CUDA: fixed mmvq kernel for bs 2,3,4 and -sm row (#5386) b2093 Johannes Gäßler 2024-02-07 12:40:26 +01:00
  • beed7faefa
    Merge branch 'ggerganov:master' into master hsnmkls 2024-02-07 19:13:39 +08:00
  • 10afa6f1d1
    [SYCL] update install make by w64devkit (#5297) Neo Zhang Jianyu 2024-02-07 18:16:55 +08:00
  • 3b0a1a063b CUDA: fixed mmvq kernel for bs 2,3,4 and -sm row JohannesGaessler 2024-02-07 10:18:31 +01:00
  • 914922d27e
    fix bug make prompt with image always being default DrewZt 2024-02-07 16:44:15 +08:00
  • 0ef46da632
    llava-cli : always tokenize special tokens (#5382) b2091 Xiao-Yong Jin 2024-02-07 02:17:25 -06:00
  • ee1628bdfe
    Basic Vulkan Multi-GPU implementation (#5321) b2090 0cc4m 2024-02-07 07:54:50 +01:00
  • ed0bf32290
    readme : modernize (#5379) Eve 2024-02-07 06:21:30 +00:00
  • 9a697d842b
    readme : update ui list (#5354) Ben Williams 2024-02-06 22:16:48 -08:00
  • 316c7faf77
    llama : add MiniCPM support (#5346) b2087 runfuture 2024-02-07 14:15:56 +08:00
  • f3e2b4fa3f
    server : update /props with "total_slots" value (#5373) b2086 Justin Parker 2024-02-07 01:15:19 -05:00
  • b775972532 llava-cli: use the escape CLI argument, remove incomplete separate escaping process Xiao-Yong Jin 2024-02-06 23:13:17 -06:00
  • 876b98eca5 llava-cli: tokenize special tokens in prompt Xiao-Yong Jin 2024-02-06 23:07:27 -06:00
  • f68664ac24
    convert : fix TypeError on GPT-2 vocab.json (#5288) Sang-Kil Park 2024-02-07 13:28:00 +09:00
  • 583cb8d540
    Update README.md Eve 2024-02-07 02:44:58 +00:00
  • 6f2014a029
    recommend Q4_K_M quantization method Eve 2024-02-07 02:24:26 +00:00
  • 0eff982f61
    make build instructions generic Eve 2024-02-07 02:19:33 +00:00
  • 3ff93c9c3a
    Delete SHA256SUMS Eve 2024-02-07 01:50:37 +00:00
  • e19eeff80e
    first cleanup, update everything to Llama 2 and remove outdated content Eve 2024-02-07 01:49:04 +00:00
  • 390996d84c Minor change of the constant names for minicpm vincent 2024-02-07 08:47:04 +08:00
  • 425ae7401f
    Merge branch 'ggerganov:master' into master hsnmkls 2024-02-07 07:31:47 +08:00
  • 0c02642d03 build.zig add macos Hasan Mukhlis 2024-02-07 07:26:17 +08:00
  • 12789eb308 Reverting Makefile root 2024-02-06 22:45:21 +00:00
  • 7aa974de5e Added numa options to allow finer grained control as well as plumbing for a new mirror mode that will require numa.h root 2024-02-06 22:43:13 +00:00
  • 60b80b0e8a removed trailing whitespace root 2024-02-06 22:27:38 +00:00
  • a69d6e2b91 Removed sched.h from ggml.h, moved ggml_get_numa_affinity into ggml.c, removed trailing whitespace and fixed up a few inconsistent variables root 2024-02-06 22:23:34 +00:00
  • 7286b83d3f BERT WIP ceb/bert Jared Van Bortel 2024-02-06 17:03:12 -05:00
  • 9db65d2d0e update /props endpoint section Justin Parker 2024-02-06 16:58:44 -05:00
  • b22e92509e Rename cpu assist free function 0cc4m 2024-02-06 22:54:36 +01:00
  • 2ea831b212 remove num_slots from default_generation_settings_for_props Justin Parker 2024-02-06 14:56:42 -05:00
  • dc327c694e Merge remote-tracking branch 'upstream/master' Justin Parker 2024-02-06 14:53:57 -05:00
  • 6ada782e80 update /props endpoint docs with total_slots Justin Parker 2024-02-06 14:53:33 -05:00
  • 4ff37ea41f cleanup total_slots return value in /props endpoint Justin Parker 2024-02-06 14:53:02 -05:00
  • 22b0e207dc
    Host sample updated for IPv4+IPv6. JohnnyB 2024-02-06 19:20:43 +00:00
  • fd8351b9dc Rework backend memory management to make sure devices and buffers get properly allocated and freed 0cc4m 2024-02-06 20:16:04 +01:00
  • 213d1439fa
    server : remove model.json endpoint (#5371) b2084 Alexey Parfenov 2024-02-06 18:08:38 +00:00
  • 10dd2c5e22
    server: remove model.json endpoint ZXED 2024-02-06 20:40:05 +03:00
  • 17c97fb062
    CUDA: mul_mat_vec_q max. batch size 8 -> 4 (#5370) b2083 Johannes Gäßler 2024-02-06 18:43:06 +01:00
  • 4e1d68b393 CUDA: mul_mat_vec_q max. batch size 8 -> 4 JohannesGaessler 2024-02-06 18:34:19 +01:00
  • b08f22c882
    Update README.md (#5366) b2082 Kawrakow 2024-02-06 19:00:16 +02:00
  • f57fadc009
    Slight quantization improvement for Q4_K and Q5_K (#5361) b2081 Kawrakow 2024-02-06 17:28:02 +02:00
  • 238af6e4f2
    Update README.md Kawrakow 2024-02-06 17:26:29 +02:00
  • a9353ecd97 constants expanded for minicpm vincent 2024-02-06 22:16:20 +08:00
  • 2e9c0bd6b3
    readme : add phi, orion 14b, internlm2, and yi-VL to readme (#5362) BarfingLemurs 2024-02-06 09:06:48 -05:00
  • 2c516611f1
    CUDA: mul_mat_vec_q for batch sizes > 1 (#5351) b2079 Johannes Gäßler 2024-02-06 14:44:06 +01:00
  • 088b6ba763
    add phi, orion 14b, internlm2, and yi-VL to readme BarfingLemurs 2024-02-06 08:18:50 -05:00
  • 279a1d7448 working vulkan zig build hazelnutcloud 2024-02-06 20:36:50 +08:00
  • dbb795b995 CUDA: mul_mat_vec_q for batch sizes > 1 JohannesGaessler 2024-02-05 16:17:54 +01:00
  • 16ecbc9a02
    Merge branch 'ggerganov:master' into master hsnmkls 2024-02-06 17:24:00 +08:00
  • 8a79c591de
    server : include total "num_slots" in props endpoint (#5349) b2078 Justin Parker 2024-02-06 04:20:59 -05:00
  • 31e7903221
    server : add dynatemp_range and dynatemp_exponent (#5352) b2077 Michael Coppola 2024-02-06 04:20:00 -05:00
  • bea82a059c
    fix align Abhilash Majumder 2024-02-06 14:47:21 +05:30
  • 592e4519bb Fixed include root 2024-02-06 09:10:55 +00:00
  • 65792fa407 Reverted Makefile root 2024-02-06 09:08:57 +00:00
  • cef47e44c9
    branch Abhilash Majumder 2024-02-06 14:36:32 +05:30