Commit graph

  • 5e6358398c Moving Block_release to the deallocation code Paul Tsochantaris 2024-10-07 12:09:16 +01:00
  • 2bd826de0a metal : minor [no ci] Georgi Gerganov 2024-10-07 11:50:41 +03:00
  • 70ff50d753 metal : avoid reference of device context in the backend context Georgi Gerganov 2024-10-07 11:46:34 +03:00
  • 34e0e6eae4 metal : g_state -> g_ggml_ctx_dev_main [no ci] Georgi Gerganov 2024-10-07 10:52:54 +03:00
  • 1bd5018c63 metal : minor fix [no ci] Georgi Gerganov 2024-10-07 10:49:25 +03:00
  • 5f71096e47 metal : avoid unnecessary singleton accesses Georgi Gerganov 2024-10-07 10:23:44 +03:00
  • 332506910f tool-call: accept {"type": "function", "name": "fn" for llama 3.x ochafik 2024-10-07 02:23:37 +01:00
  • 241acc2488 agent: disable brave_search when BRAVE_SEARCH_API_KEY unset ochafik 2024-10-07 02:22:52 +01:00
  • b150ffad41 fix merge slaren 2024-10-07 01:10:35 +02:00
  • e7cbdcf0a2 ggml: Add POOL2D OP for GPU ACC to the Vulkan. Changyeon Kim 2024-10-06 23:31:27 +09:00
  • 044fc38bd0 Merge branch 'ggerganov:master' into vlm Changyeon Kim 2024-10-06 22:50:30 +09:00
  • 9923342c27 Update sampling.cpp Vignesh Skanda 2024-10-06 18:58:55 +05:30
  • f3d40e42f7 Update requirements-pydantic.txt Vignesh Skanda 2024-10-06 18:53:30 +05:30
  • d5cb86844f contrib : simplify + minor edits [no ci] Georgi Gerganov 2024-10-06 14:15:27 +03:00
  • 39940e5fa3 Algorithm rework MaggotHATE 2024-10-06 16:15:12 +05:00
  • 094caea359 Merge branch 'ggerganov:master' into master MaggotHATE 2024-10-06 16:06:32 +05:00
  • f4b2dcdf49 readme : fix typo [no ci] Georgi Gerganov 2024-10-06 13:49:41 +03:00
  • 6dcb899170 metal : fix build when MTLGPUFamilyApple3 is not available Georgi Gerganov 2024-10-06 13:16:18 +03:00
  • 4b161bc673 metal : fix indent Georgi Gerganov 2024-10-06 13:10:35 +03:00
  • 5ea66f4354 fixes slaren 2024-10-06 00:37:25 +02:00
  • 4ef1b017af llama : adapt to backend changes Georgi Gerganov 2024-10-04 17:35:58 +03:00
  • c080e92e75 cont : alternative initialization of global objects Georgi Gerganov 2024-10-04 15:44:14 +03:00
  • 2e7e05c09b metal : global registry and device instances Georgi Gerganov 2024-10-04 15:10:28 +03:00
  • 2d8c2c79ca metal : fix names [no ci] Georgi Gerganov 2024-10-04 14:31:00 +03:00
  • 621460063e ggml : add metal backend registry / device Georgi Gerganov 2024-10-02 11:14:38 +03:00
  • b6d6c5289f sync : llama.cpp b3889 Georgi Gerganov 2024-10-06 12:53:28 +03:00
  • b0915d5b51 vulkan : retry allocation with fallback flags (whisper/2451) SRHMorris 2024-10-06 08:34:20 +01:00
  • f5c35c109c flake.lock: Update github-actions[bot] 2024-10-06 00:22:40 +00:00
  • e179dd4929 cmake : link dl explicitly for Android Andrew Minh Nguyen 2024-10-01 13:21:56 -05:00
  • fa049cd6e3 docs : add cross-compiling for Android Andrew Minh Nguyen 2024-09-27 13:36:23 -05:00
  • 63e60deda3 Swapped sorting for a custom algorithm MaggotHATE 2024-10-05 23:27:36 +05:00
  • 49a2fd0e28 docs : update building Android on Termux Andrew Minh Nguyen 2024-09-27 13:05:25 -05:00
  • 2f8652383d docs : clarify building Android on Termux Andrew Minh Nguyen 2024-09-26 21:03:26 -05:00
  • 59e8e63e68 Merge branch 'ggerganov:master' into master MaggotHATE 2024-10-05 21:51:52 +05:00
  • 8c475b97b8 rerank : use [SEP] token instead of [BOS] (#9737) b3887 Georgi Gerganov 2024-10-05 15:55:04 +03:00
  • 58b16695e1 sync : ggml b3886 Georgi Gerganov 2024-10-05 15:53:49 +03:00
  • 905f5485b2 metal : zero-init buffer contexts (whisper/0) Georgi Gerganov 2024-10-05 14:33:54 +03:00
  • 6ada2e7cfc ci : add shebang to run.sh Georgi Gerganov 2024-10-05 15:03:02 +03:00
  • 7403c05c06 Single allocation of encode_async block with non-ARC capture in ggml-metal.m Paul Tsochantaris 2024-10-05 02:00:59 +01:00
  • aa23425236 Fix Vit & Patch merging root 2024-10-05 00:41:54 +00:00
  • e41a5403a0 reinstate trailing \n ochafik 2024-10-05 01:15:48 +01:00
  • 52c5a6244f server: fix disconnection logic in test (before post response headers) ochafik 2024-10-05 00:44:13 +01:00
  • 6f693f14b0 server: use (new) Request::is_alive as set_content_provider called after status / headers sent ochafik 2024-10-04 23:51:58 +01:00
  • 56e149d627 add quantize method Yutong Dai 2024-10-04 22:38:39 +00:00
  • 74f657cc24 Fixed broken randomization MaggotHATE 2024-10-04 23:47:19 +05:00
  • 899e0732ee Merge branch 'ggerganov:master' into master MaggotHATE 2024-10-04 23:46:03 +05:00
  • 49cd2118e0 Moved min_keep MaggotHATE 2024-10-04 23:35:47 +05:00
  • 2dc708c72a Delete tests/.DS_Store ochafik 2024-10-04 19:32:46 +01:00
  • 71967c2a6d Add Llama Assistant (#9744) Viet-Anh NGUYEN (Andrew) 2024-10-05 01:29:35 +07:00
  • 6d94ba2e58 Fixed forgotten header MaggotHATE 2024-10-04 22:51:04 +05:00
  • 6297d9fec1 server: avoid calling sink.is_alive() after it died 🧟‍♂️ ochafik 2024-10-04 18:45:09 +01:00
  • 4f8e55b170 Fixed RNG to be reproduceable MaggotHATE 2024-10-04 22:38:12 +05:00
  • a3c1e09186 Add Llama Assistant Viet-Anh NGUYEN (Andrew) 2024-10-04 23:44:02 +07:00
  • f2a2a618a2 Fixed trailing backspaces MaggotHATE 2024-10-04 21:42:54 +05:00
  • d9c9203a0b Merge branch 'ggerganov:master' into master MaggotHATE 2024-10-04 21:35:23 +05:00
  • 41e16654bd First fixes by comments MaggotHATE 2024-10-04 21:34:31 +05:00
  • 17880771ad sync : ggml b3883 Georgi Gerganov 2024-10-04 18:50:25 +03:00
  • 55951c018d ggml : fix typo in example usage ggml_gallocr_new (ggml/984) Daniel Bevenius 2024-10-04 15:46:18 +02:00
  • ff565769f2 ggml : fixes after sync (ggml/983) Diego Devesa 2024-10-04 08:41:40 +02:00
  • db54ac5df4 Simplified chances calculation MaggotHATE 2024-10-04 18:30:46 +05:00
  • 9455194056 Cleanup MaggotHATE 2024-10-04 17:53:13 +05:00
  • 89640b00a1 Initial XTC commit MaggotHATE 2024-10-04 17:51:27 +05:00
  • 16ff502214 server: mime nit ochafik 2024-10-04 13:24:15 +01:00
  • 03efb92fde server: support cancellation of prompt processing ochafik 2024-10-04 13:22:57 +01:00
  • f3fdcfaa79 ci : fine-grant permission (#9710) b3880 Xuan Son Nguyen 2024-10-04 11:47:19 +02:00
  • e51973f21e ci : adjust rank score interval Georgi Gerganov 2024-10-04 12:06:58 +03:00
  • 9e897d4439 common : sanity check for non-NULL tokens Georgi Gerganov 2024-10-04 12:04:54 +03:00
  • 133c7b46b3 Fixed RNG seed docs (#9723) b3879 Daniel Kleine 2024-10-04 10:54:44 +02:00
  • 1ba3df3de5 rerank : use [SEP] token instead of [BOS] Georgi Gerganov 2024-10-04 11:54:32 +03:00
  • 42f546500f server: introduce supposedly lighterweight is_alive in httplib (https://github.com/yhirose/cpp-httplib/pull/1956) ochafik 2024-10-04 05:18:39 +01:00
  • 43e306e08f server: fix error status ochafik 2024-10-04 04:26:11 +01:00
  • a151ddcd5a agent: handle function errors and dont' stringify str outputs ochafik 2024-10-04 04:06:00 +01:00
  • d6b86bea25 Merge branch 'master' into vlm Changyeon Kim 2024-10-04 10:52:22 +09:00
  • 88d26f1f1e ggml: Add POOL2D OP for GPU ACC to the Vulkan. Changyeon Kim 2024-10-04 10:43:51 +09:00
  • fa5b31a5ca vulkan : add GGML_VK_FORCE_HEAP_INDEX env var Yifan Gu 2024-10-03 23:14:28 +00:00
  • e28cdd78b0 quantization script added Yutong Dai 2024-10-03 22:08:36 +00:00
  • 21a3c90a1c agent: tool tweaks (remove ansi escapes from python output, update env keys + provider docs) Olivier Chafik 2024-10-03 22:20:34 +01:00
  • 03d1a23256 changed print format to unsigned Daniel Kleine 2024-10-03 22:57:37 +02:00
  • 366efc8a18 tool-call: fix llama 3.x tc parsing when there are spaces before "name" Olivier Chafik 2024-10-03 21:46:41 +01:00
  • 6a262b62f0 Don't use a specific version for the package (CMake throws and error) Uglješa Lukešević 2024-10-03 21:00:22 +02:00
  • da02397f7f agent: support more providers (+ extract serve_tools_inside_docker.sh) Olivier Chafik 2024-10-03 19:18:47 +01:00
  • b4fc1e8ba7 tool-call: adjust triggers to most common tool call variations from Llama-3.1-8B and Llama-3.2-3B Olivier Chafik 2024-10-03 19:17:32 +01:00
  • ece12b074f antiprompts: ensure partial match is at end of string (or else server stops sending replies) Olivier Chafik 2024-10-03 19:10:21 +01:00
  • d5ed2b929d metal : remove abort (skip) (ggml/0) b3878 Georgi Gerganov 2024-10-03 21:18:19 +03:00
  • 1bb8a64ebf sync : ggml Georgi Gerganov 2024-10-03 21:17:49 +03:00
  • fabdc3bda3 ggml/ex: calculate accuracy in graph, adapt MNIST (ggml/980) Johannes Gäßler 2024-10-03 17:29:59 +02:00
  • eee39bdc96 ggml: refactor cross entropy loss CPU impl. (ggml/976) Johannes Gäßler 2024-10-02 15:32:39 +02:00
  • a88c0d5f26 wip Xuan Son Nguyen 2024-10-03 20:15:36 +02:00
  • 5d5ab1e5cc metal : fix compute pass descriptor autorelease crash (#9718) b3874 Jack Mousseau 2024-10-03 11:01:46 -07:00
  • a7ad553513 ggml-backend : add device description to CPU backend (#9720) b3873 Diego Devesa 2024-10-03 17:39:18 +02:00
  • d6fe7abf04 ggml: unify backend logging mechanism (#9709) b3872 bandoti 2024-10-03 12:39:03 -03:00
  • d436f5ba2c [metal] (HACK!!!) force use kernel_flash_attn_ext_scalar_f16 in FA Shupei Fan 2024-09-24 12:33:32 +08:00
  • 9e62e7e10e [metal-kernel] add flash_attn_ext_scalar_f16 implementation Shupei Fan 2024-09-22 22:03:18 +08:00
  • e3c355ba65 convert : handle tokenizer merges format from transformers 4.45 (#9696) compilade 2024-10-03 10:22:15 -04:00
  • 0a35c3c460 Remove cuda log statement Mason M 2024-10-03 08:37:04 -03:00
  • 83a9a98543 ggml-backend : add device description to CPU backend slaren 2024-10-03 13:16:03 +02:00
  • 19eb9ec800 Remove log callbacks from ggml backends Mason M 2024-10-03 07:25:38 -03:00
  • 2a016d9c21 Support for Minerva 7B Riccardo Orlando 2024-10-03 12:03:53 +02:00
  • 841713e1e4 rpc : enable vulkan (#9714) b3870 Radoslav Gerganov 2024-10-03 13:00:52 +03:00
  • 9d9a7eeb7f Merge remote-tracking branch 'origin/master' into ggml-logger-subsystem Mason M 2024-10-03 06:42:29 -03:00