Commit graph

  • 597bc152b2 llama.cpp : fix --leave-output-tensor for llama-quantize. drollings 2024-10-10 16:58:15 -05:00
  • 7eee341bee common : use common_ prefix for common library functions (#9805) b3906 Diego Devesa 2024-10-10 22:57:42 +02:00
  • cb61ecf463 Merge remote-tracking branch 'origin/master' into sl/rename-common-funcs slaren 2024-10-10 20:48:56 +02:00
  • 0e9f760eb1 rpc : add backend registry / device interfaces (#9812) b3905 Diego Devesa 2024-10-10 20:14:55 +02:00
  • cf8e0a3bb9 musa: add docker image support (#9685) b3904 R0CKSTAR 2024-10-11 02:10:37 +08:00
  • 72db625bd4 Added XTC to server UIs MaggotHATE 2024-10-10 22:59:23 +05:00
  • 6bafc50650 Merge remote-tracking branch 'origin/master' into sl/rename-common-funcs slaren 2024-10-10 19:51:53 +02:00
  • c7499c557c examples : do not use common library in simple example (#9803) b3903 Diego Devesa 2024-10-10 19:50:49 +02:00
  • ba994a3a8a ggml_backend_rpc_start_rpc_server -> ggml_backend_rpc_start_server slaren 2024-10-10 19:48:37 +02:00
  • f7a383ffb3 Initial server support MaggotHATE 2024-10-10 21:48:49 +05:00
  • d150c7e309 Update ggml/src/ggml-cuda/dmmv.cu agray3 2024-10-10 15:47:11 +01:00
  • 2107882cf5 Renamed parameters, fixed info and defaults MaggotHATE 2024-10-10 19:35:28 +05:00
  • 9f2fd8fd4f Removed silent event APIs exits OuadiElfarouki 2024-10-10 15:11:17 +01:00
  • d07dc44c63 addressed comment Alan Gray 2024-10-10 06:05:12 -07:00
  • 95c8b9c1b7 Vectorize load instructions in dmmv f16 CUDA kernel Alan Gray 2024-10-08 05:00:26 -07:00
  • ba29d31fb7 Merge branch 'ggerganov:master' into master MaggotHATE 2024-10-10 11:42:50 +05:00
  • ed78de20e4 Merge branch 'master' into pr/8917 Nexesenex 2024-10-10 02:33:59 +02:00
  • 740e7cb6e5 llama : add llama_supports_rpc API slaren 2024-10-10 02:26:35 +02:00
  • e9f46640d8 Merge 1440d445db into c81f3bbb05 MasterYi1024 2024-10-10 00:08:36 +01:00
  • c27f7f62f5 Merge b1a8c244ce into c81f3bbb05 Richard Ulmer 2024-10-10 00:08:31 +01:00
  • ca71edfefc Merge 2793b863bb into c81f3bbb05 Herman Semenoff 2024-10-10 00:08:16 +01:00
  • 40ed42427c Merge 3578d09729 into c81f3bbb05 Zhenwei Jin 2024-10-10 00:07:58 +01:00
  • 3de720c630 Merge f4f5b7ac56 into c81f3bbb05 Maximilian Winter 2024-10-10 00:07:42 +01:00
  • bc40adb1fa rpc : add backend registry / device interfaces slaren 2024-10-09 21:54:21 +02:00
  • 07ba8320fa minor : indent [no ci] Georgi Gerganov 2024-10-09 21:36:41 +03:00
  • 83a90c987c better names for common params fns slaren 2024-10-09 19:03:43 +02:00
  • c81f3bbb05 cmake : do not build common library by default when standalone (#9804) b3902 Diego Devesa 2024-10-09 18:49:52 +02:00
  • df37af056c Using GGML_UNUSED instead of UNUSED OuadiElfarouki 2024-10-09 17:30:53 +01:00
  • 06444a603f add command line parser, simplify code slaren 2024-10-09 18:28:18 +02:00
  • 6ea0304b20 update android example slaren 2024-10-09 16:47:24 +02:00
  • e7022064ab perplexity : fix integer overflow (#9783) b3901 Georgi Gerganov 2024-10-09 17:00:18 +03:00
  • 4b44ea6224 enable common on android example slaren 2024-10-09 14:27:42 +02:00
  • 1c4d573c5f examples : do not use common library in simple example slaren 2024-10-09 14:16:01 +02:00
  • 3c0b8628cd rename gpt to common slaren 2024-10-09 14:07:04 +02:00
  • 0c18d216b4 fix general.license list to str momonga 2024-10-09 21:06:05 +09:00
  • 672438dce1 update common.cpp slaren 2024-10-09 13:46:01 +02:00
  • cd097710d7 update ngram-cache slaren 2024-10-09 12:50:19 +02:00
  • aee57d44c6 no longer necessary to disambiguate common functions with :: slaren 2024-10-09 12:46:18 +02:00
  • e58d3b1214 rename llama_arg slaren 2024-10-09 12:44:49 +02:00
  • 4f7e4b5e19 common : use common_ prefix for common library functions slaren 2024-10-09 12:26:27 +02:00
  • 25d4972ff0 cmake : do not build common library by default when standalone slaren 2024-10-09 12:00:27 +02:00
  • 37e02e34a1 Added XTC to README MaggotHATE 2024-10-09 14:08:02 +05:00
  • ed535bb2ae Merge branch 'ggerganov:master' into master MaggotHATE 2024-10-09 14:00:55 +05:00
  • 6556c90171 server : update slot->prompt after restore Georgi Gerganov 2024-10-09 09:12:34 +03:00
  • 61a66f25ab llama : improve infill support Georgi Gerganov 2024-10-08 14:24:22 +03:00
  • 3dc48fe75a examples : remove llama.vim Georgi Gerganov 2024-10-09 10:55:42 +03:00
  • 4bd0c618d1 Merge pull request #32 from HanClinto/ffmpeg_flag tc-mb 2024-10-09 15:25:08 +08:00
  • 6e0ce3887c removed trailing whitespace OuadiElfarouki 2024-10-08 22:11:13 +01:00
  • 38e6ed46bf Merge branch 'master' into sycl_async_data_load OuadiElfarouki 2024-10-08 22:05:12 +01:00
  • 05420dbba8 fix logging in examples/main/main.cpp Kurt Manucredo 2024-10-08 21:02:50 +00:00
  • b373dcdd46 Restructured ggml-sycl.cpp OuadiElfarouki 2024-10-08 21:52:47 +01:00
  • fbad686918 sycl : Added device and backend reg interfaces OuadiElfarouki 2024-10-08 21:20:43 +01:00
  • d0b1053897 Fixed incorrect min_keep check MaggotHATE 2024-10-09 00:59:46 +05:00
  • c2c2626ec6 Added support for SFTTrainer checkpoint models and adapter models containing one or more non-LoRA weights Victor Oluwadare 2024-10-08 20:31:43 +01:00
  • 6feb6b399c Update dump info in common MaggotHATE 2024-10-08 21:15:37 +05:00
  • c19fb26042 Merged back lost commits in common and arg MaggotHATE 2024-10-08 21:11:35 +05:00
  • 09bc6d507c Updated info in common and args MaggotHATE 2024-10-08 20:57:36 +05:00
  • 81a0c2603c Simplified algorithm and more tests MaggotHATE 2024-10-08 18:38:43 +05:00
  • 8110f783d1 Merge branch 'ggerganov:master' into master MaggotHATE 2024-10-08 18:36:38 +05:00
  • dca1d4b58a ggml : fix BLAS with unsupported types (#9775) b3899 Diego Devesa 2024-10-08 14:21:43 +02:00
  • 458367a906 server : better security control for public deployments (#9776) b3898 Xuan Son Nguyen 2024-10-08 13:27:04 +02:00
  • 3bea2e6a86 fix tests Xuan Son Nguyen 2024-10-08 11:48:35 +02:00
  • 468551e7a6 fix typo Xuan Son Nguyen 2024-10-08 11:06:53 +02:00
  • 7fee203717 update server docs Xuan Son Nguyen 2024-10-08 11:05:52 +02:00
  • a533aac16e fix tests Xuan Son Nguyen 2024-10-08 11:03:59 +02:00
  • fbefe1731c perplexity : keep n_vocab as int and make appropriate casts Georgi Gerganov 2024-10-08 09:40:39 +03:00
  • 22cc760dba perplexity : fix integer overflow Georgi Gerganov 2024-10-08 09:13:54 +03:00
  • fa42aa6d89 scripts : fix spelling typo in messages and comments (#9782) standby24x7 2024-10-08 15:19:53 +09:00
  • b0032007c5 scripts : Fix spelling typo in messages and comments Masanari Iida 2024-10-08 13:10:30 +09:00
  • f64fa24854 mtgpu: enable docker workflow Xiaodong Ye 2024-10-08 09:44:23 +08:00
  • 0c03b923d7 mtgpu: add docker image support Xiaodong Ye 2024-10-01 17:50:33 +08:00
  • c6396aa4bb Added support for SFTTrainer checkpoint models and adapter models containing some non-LoRA weights Victor Oluwadare 2024-10-08 02:35:08 +01:00
  • e753f15229 agent: move openapi helpers to their own file ochafik 2024-10-08 01:34:12 +01:00
  • 953bef9374 fix memeory error & clean the print statements root 2024-10-07 21:36:10 +00:00
  • 7560ecb32d protect /props endpoint Xuan Son Nguyen 2024-10-07 23:21:29 +02:00
  • 20ca856ab1 llama : print devices used on model load slaren 2024-10-07 22:45:30 +02:00
  • 5f4e30ddba vulkan : add backend registry / device interfaces slaren 2024-10-04 00:53:34 +02:00
  • 98b204c918 Merge branch 'ggerganov:master' into master MaggotHATE 2024-10-08 01:20:14 +05:00
  • dbe9ef7783 Added XTC to test-sampling MaggotHATE 2024-10-08 01:19:39 +05:00
  • d74105f2c7 ggml : rename ggml_internal_get_type_traits -> ggml_get_type_traits slaren 2024-10-07 22:02:16 +02:00
  • e2e10ff199 ggml : do not use BLAS with types without to_float slaren 2024-10-07 22:00:03 +02:00
  • 6374743747 ggml : add backend registry / device interfaces to BLAS backend (#9752) b3896 Diego Devesa 2024-10-07 21:55:08 +02:00
  • 0986f3faef minor slaren 2024-10-07 21:04:57 +02:00
  • 59ee00a880 fix mmap usage when using host buffers slaren 2024-10-07 20:42:52 +02:00
  • 0f3e091f1d ggml : add backend registry / device interfaces to BLAS backend slaren 2024-10-07 20:39:44 +02:00
  • 460ec6f4f6 server : more explicit endpoint access settings Xuan Son Nguyen 2024-10-07 19:35:11 +02:00
  • f1af42fa8c Update building for Android (#9672) b3895 Andrew Minh Nguyen 2024-10-07 09:37:31 -07:00
  • 6279dac039 flake.lock: Update (#9753) Georgi Gerganov 2024-10-07 19:35:42 +03:00
  • 4c44e3da5a Merge branch 'ggerganov:master' into master MaggotHATE 2024-10-07 21:28:09 +05:00
  • f1b746ae97 Merge branch 'master' into sycl_async_data_load OuadiElfarouki 2024-10-07 17:08:40 +01:00
  • d5ac8cf2f2 ggml : add metal backend registry / device (#9713) Georgi Gerganov 2024-10-07 18:27:51 +03:00
  • 901691c2b5 metal : remove transfer rate stuff Georgi Gerganov 2024-10-07 18:09:46 +03:00
  • 2294f078cd metal : fix maxTransferRate check Georgi Gerganov 2024-10-07 17:16:59 +03:00
  • a70379d941 Merge remote-tracking branch 'origin/master' into sl/backend-registry-2-add-metal Georgi Gerganov 2024-10-07 16:17:31 +03:00
  • 96b6912103 metal : single allocation of encode_async block (#9747) b3892 Paul Tsochantaris 2024-10-07 13:26:31 +01:00
  • a8990dbb8d [gguf-py] gguf_reader: numpy 2 newbyteorderfix Jett Janiak 2024-10-07 13:10:11 +01:00
  • 9a465199a1 py: Add base_model_sources and dataset_sources to metadata heuristics brian khuu 2024-10-07 22:54:51 +11:00
  • 640039106f py: let users add full base model and dataset to model_card brian khuu 2024-08-06 00:42:27 +10:00
  • 594a07a515 Update ggml/src/ggml-metal.m Georgi Gerganov 2024-10-07 14:40:28 +03:00
  • 4af03de2a6 Release encode block when re-setting encoding buffer count if needed Paul Tsochantaris 2024-10-07 12:21:28 +01:00