Commit graph

  • 5cfabaee25 Merge branch 'master' into concedo_experimental Concedo 2023-10-15 15:50:20 +08:00
  • 11bff29045 MPT : support GQA for replit-code-v1.5 (#3627) b1382 cebtenzzre 2023-10-15 02:32:06 -04:00
  • 3c91fb67ef feat: inference with alpaca prompt script added Jinho Heo 2023-10-15 14:24:07 +09:00
  • e3261ffad3 cleanup memory usage around clip_image_* Damian Stewart 2023-10-15 00:18:04 +02:00
  • 2c22221c55 MPT : support GQA for replit-code-v1.5 Cebtenzzre 2023-10-14 15:17:41 -04:00
  • 420744467f MPT : clone wte to output at load time Cebtenzzre 2023-10-14 14:38:28 -04:00
  • b577e6374e MPT : fix an offload func typo Cebtenzzre 2023-10-14 13:00:39 -04:00
  • 2847ecf2dd expose llava methods in libllama.dylib Damian Stewart 2023-10-14 19:15:35 +02:00
  • 09edb7ecdf get libllava to output in the right place Damian Stewart 2023-10-14 18:59:40 +02:00
  • 52a776740d Merge branch 'ggerganov:master' into master vvhg1 2023-10-14 18:38:52 +02:00
  • 708928c649 fix bug where base64 string was not removed from the prompt Damian Stewart 2023-10-14 18:24:55 +02:00
  • f21af512cd wip Damian Stewart 2023-10-14 17:20:21 +02:00
  • b9f533b997 move llava into its own subdir Damian Stewart 2023-10-14 17:11:32 +02:00
  • 4de5a2d473 speculative : add tree-based sampling support Georgi Gerganov 2023-10-14 17:54:02 +03:00
  • 4e5c5c451c notify the user from server ui that multimodality is unavailable FSSRepo 2023-10-14 08:28:49 -04:00
  • f8eddcf8e8 collapse clip and llava libraries Damian Stewart 2023-10-14 13:13:40 +02:00
  • e2cd07cf87 move base64.hpp into common/ Damian Stewart 2023-10-14 13:03:13 +02:00
  • f83c0606bd further cleanup; move llava-cli into its own file and rename Damian Stewart 2023-10-14 12:58:40 +02:00
  • 11dc1091f6 Honor -ngl option for Cuda offloading in llava (#3621) b1381 M. Yusuf Sarıgöz 2023-10-14 13:52:44 +03:00
  • 0889117573 cleanup Damian Stewart 2023-10-14 12:39:00 +02:00
  • c6932085fe refactor image load out of llava init Damian Stewart 2023-10-14 11:51:33 +02:00
  • e90a6515dd Fix vulkan shader fp32 name 0cc4m 2023-10-14 11:23:44 +02:00
  • 299f6b54d8 fix compilation errors with llvm Damian Stewart 2023-10-14 11:17:38 +02:00
  • ee652b2ab4 process escapes for negative prompt and interactive consecutive prompts vvhg1 2023-10-14 11:08:50 +02:00
  • 8eef958349 Merge branch 'ggerganov:master' into master vvhg1 2023-10-14 10:56:24 +02:00
  • 35b10d149f Merge upstream changes, fix conflict 0cc4m 2023-10-14 10:51:53 +02:00
  • bd054470b9 Close file before deletion 0cc4m 2023-10-14 10:49:45 +02:00
  • 8224ca5775 wip refactor image loading Damian Stewart 2023-10-14 10:43:13 +02:00
  • 7efac61c7a Fix shader generator script Windows compatibility 0cc4m 2023-10-14 10:31:20 +02:00
  • 1e6e13f32b Clean up code 0cc4m 2023-10-14 10:30:39 +02:00
  • de4b813c5f Replace shaderc dependency with precompiled shaders 0cc4m 2023-10-14 09:55:08 +02:00
  • 73d01d14aa Add Python-based Vulkan shader generator 0cc4m 2023-10-14 08:46:16 +02:00
  • 7e64bfe060 refactor code + remove unused comments + improve README.md FSSRepo 2023-10-14 00:31:34 -04:00
  • 9f72b44635 add multimodal input - alpha FSSRepo 2023-10-13 23:36:32 -04:00
  • 9cddae2512 Update README.md BarfingLemurs 2023-10-13 20:35:49 -04:00
  • 932589c0ef Honor -ngl option for Cuda offloading in llava llava-fix-offloading M. Yusuf Sarıgöz 2023-10-14 03:12:10 +03:00
  • de35b47908 fixed token probs FSSRepo 2023-10-13 19:55:25 -04:00
  • dfa380dff4 Update README.md BarfingLemurs 2023-10-13 19:45:27 -04:00
  • 9d98cdda2c llava multimodal integration FSSRepo 2023-10-13 18:42:44 -04:00
  • 32fe1a58e3 train-text-from-scratch : fix assert failure in ggml-alloc slaren 2023-10-13 21:15:34 +02:00
  • eb08201227 add changes to README.md FSSRepo 2023-10-13 14:28:06 -04:00
  • a2c2d98c16 add context swap FSSRepo 2023-10-13 14:12:50 -04:00
  • b6d9e212e5 fixed timings per slot FSSRepo 2023-10-13 13:10:38 -04:00
  • a410a9e300 unused change reverted FSSRepo 2023-10-13 12:23:58 -04:00
  • 6358ae5f48 server ui now supports multiple clients FSSRepo 2023-10-13 12:22:54 -04:00
  • 770dc9da0d add base64 in-prompt image support Damian Stewart 2023-10-13 17:53:17 +02:00
  • 4ba5a5013d chat.mjs support cached prompt + some fixes FSSRepo 2023-10-13 11:06:41 -04:00
  • 9ef91b13ea Merge branch 'master' of https://github.com/ggerganov/llama.cpp into ntkv2 Cebtenzzre 2023-10-13 09:58:07 -04:00
  • d13686107a Update README.md BarfingLemurs 2023-10-13 09:35:29 -04:00
  • 2a4bcbacea llama : remove n_threads from llama_decode_internal (#3614) b1380 Daniel Bevenius 2023-10-13 12:33:16 +02:00
  • 9f6818bdd7 llama: remove n_threads from llama_decode_internal Daniel Bevenius 2023-10-13 12:22:07 +02:00
  • 424b6381c4 ggml : add context enumeration functions (#3605) b1379 slaren 2023-10-13 12:23:10 +02:00
  • 3c10d9f3de add external llava API Damian Stewart 2023-10-13 12:21:44 +02:00
  • 0209d39526 wip llava python bindings compatibility Damian Stewart 2023-10-13 10:33:07 +02:00
  • 11b7f1e6f1 Update README.md: add Aquila2. ldwang 2023-10-13 12:57:32 +08:00
  • 643902fbbb fixed tensor split save and load Concedo 2023-10-13 10:07:22 +08:00
  • 500ac7120e cached prompt support FSSRepo 2023-10-12 21:16:12 -04:00
  • 1c28116de4 don't add space when using special tokens staviq 2023-10-13 01:14:23 +02:00
  • 83c2b3553a grammar + no stream completion FSSRepo 2023-10-12 18:43:57 -04:00
  • 605e701cb4 Fixes suggested by @mmnga Galunid 2023-10-12 23:56:24 +02:00
  • a85229c4e9 ggml : add context enumeration functions; finetune : fix assert failure in ggml-alloc slaren 2023-10-12 23:44:36 +02:00
  • 5b8e29de53 multiple client support FSSRepo 2023-10-12 17:09:12 -04:00
  • a3827779d7 Adapt Refact graph building KerfuffleV2 2023-10-12 14:41:28 -06:00
  • ae31a9a0b6 Fix kq_scale for Baichuan KerfuffleV2 2023-10-12 14:23:08 -06:00
  • 5a06711f64 Adapt baichuan (not tested yet) KerfuffleV2 2023-10-12 14:18:04 -06:00
  • 81484805f0 completion endpoint working FSSRepo 2023-10-12 16:17:27 -04:00
  • 40aa5c6f8a CLBlast: Fix temporary buffer size for f16 conversion (wsize) shibe2 2023-10-11 21:30:06 +04:00
  • 1e0e873c37 CLBlast: Fix matrix-vector multiplication (#3544) b1378 shibe2 2023-10-12 23:59:47 +04:00
  • 06c278895f Fix incorrect offloading and norm_rms_eps value KerfuffleV2 2023-10-12 13:30:05 -06:00
  • 29c8cdd65d refactored sampling function FSSRepo 2023-10-12 15:02:19 -04:00
  • 5261aee8d8 sampling : one sequence per sampling context rev-sampling Georgi Gerganov 2023-10-12 20:35:01 +03:00
  • b716eeb72a Merge branch 'master' of https://github.com/ggerganov/llama.cpp FSSRepo 2023-10-12 12:55:08 -04:00
  • 78504218b9 save dev progress FSSRepo 2023-10-12 12:51:48 -04:00
  • 7fc0250d15 1. check in ggml.c if endianness does not match 2. update GGUF version 3. change get_pack_prefix to property 4. update information log chenqiny 2023-10-13 00:23:16 +08:00
  • 5974d617c0 print prefix/suffix if verbose, main: split prefix input suffix staviq 2023-10-12 17:49:15 +02:00
  • 370359e5ba examples: support LLaVA v1.5 (multimodal model) (#3436) b1377 M. Yusuf Sarıgöz 2023-10-12 18:23:18 +03:00
  • 0bd7e69de6 do not use Wno-cast-qual for MSVC M. Yusuf Sarıgöz 2023-10-12 17:20:22 +03:00
  • 0f1c569540 swift fix staviq 2023-10-12 16:17:09 +02:00
  • 9e24cc6e2e docs : fix typo GOMP_CPU_AFFINITY (#3597) uint256_t 2023-10-12 22:36:16 +09:00
  • e46a139859 GOMP_CPU_AFFINITY is correct maekawatoshiki 2023-10-12 22:15:36 +09:00
  • 4bc5c9c5d5 llava : code formatting, rename files, fix compile warnings Georgi Gerganov 2023-10-12 15:35:44 +03:00
  • 346e3c1605 Merge branch 'master' into llava Georgi Gerganov 2023-10-12 15:13:48 +03:00
  • e9534ea665 fix typo M. Yusuf Sarıgöz 2023-10-12 15:03:25 +03:00
  • 5c6b2be11f llama : normalize code-style Georgi Gerganov 2023-10-12 14:47:29 +03:00
  • 04ac0558de Merge branch 'master' into HEAD Georgi Gerganov 2023-10-12 14:35:47 +03:00
  • 56ccf97b4a handle default n_predict M. Yusuf Sarıgöz 2023-10-12 14:34:53 +03:00
  • d28e572c02 cmake : fix add_compile_options on macOS b1375 Georgi Gerganov 2023-10-12 14:31:05 +03:00
  • f3040beaab typo : it is --n-gpu-layers not --gpu-layers (#3592) Ian Scrivener 2023-10-12 22:10:50 +11:00
  • 1a8c8795d6 ci : check if there is enough VRAM (#3596) Georgi Gerganov 2023-10-12 13:44:56 +03:00
  • 3d27302b4d ci : check if there is enough VRAM Georgi Gerganov 2023-10-12 13:38:07 +03:00
  • 7e2f714c9c tensor split only for cuda Concedo 2023-10-12 17:01:52 +08:00
  • 11b8f97c1e Tensor split UI (#471) Alexander Abushady 2023-10-12 04:50:21 -04:00
  • 601be78a3f kcpp does sampling ourselves, so we can do whatever we want Concedo 2023-10-12 16:47:56 +08:00
  • a6c3dbc351 Merge branch 'master' into concedo_experimental Concedo 2023-10-12 16:32:00 +08:00
  • 8be043ee38 more horde optimizations Concedo 2023-10-12 16:20:52 +08:00
  • 0ae5b1aa62 removed trailing whitespace l3utterfly 2023-10-12 16:11:21 +08:00
  • c8159ead24 removed trailing whitespaces l3utterfly 2023-10-12 15:59:18 +08:00
  • 6f553d9183 added trailing newlines l3utterfly 2023-10-12 15:58:10 +08:00
  • cb58690579 reverted changes in ggml.c l3utterfly 2023-10-12 15:54:01 +08:00
  • cbf6d074b8 Merge branch 'master' into mem-fix-ios17 l3utterfly 2023-10-12 15:53:27 +08:00