Georgi Gerganov
901691c2b5
metal : remove transfer rate stuff
2024-10-07 18:09:46 +03:00
Georgi Gerganov
2294f078cd
metal : fix maxTransferRate check
2024-10-07 17:16:59 +03:00
Georgi Gerganov
a70379d941
Merge remote-tracking branch 'origin/master' into sl/backend-registry-2-add-metal
2024-10-07 16:17:31 +03:00
Paul Tsochantaris
96b6912103
metal : single allocation of encode_async block (#9747)
...
* Single allocation of encode_async block with non-ARC capture in ggml-metal.m
* Moving Block_release to the deallocation code
* Release encode block when re-setting encoding buffer count if needed
* Update ggml/src/ggml-metal.m
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-10-07 15:26:31 +03:00
Georgi Gerganov
2bd826de0a
metal : minor [no ci]
2024-10-07 11:50:41 +03:00
Georgi Gerganov
70ff50d753
metal : avoid reference of device context in the backend context
...
ggml-ci
2024-10-07 11:46:34 +03:00
Georgi Gerganov
34e0e6eae4
metal : g_state -> g_ggml_ctx_dev_main [no ci]
2024-10-07 10:52:54 +03:00
Georgi Gerganov
1bd5018c63
metal : minor fix [no ci]
2024-10-07 10:49:25 +03:00
Georgi Gerganov
5f71096e47
metal : avoid unnecessary singleton accesses
...
ggml-ci
2024-10-07 10:47:24 +03:00
slaren
b150ffad41
fix merge
2024-10-07 01:10:35 +02:00
Georgi Gerganov
d5cb86844f
contrib : simplify + minor edits [no ci]
2024-10-06 14:15:27 +03:00
Georgi Gerganov
f4b2dcdf49
readme : fix typo [no ci]
2024-10-06 13:49:41 +03:00
Georgi Gerganov
6dcb899170
metal : fix build when MTLGPUFamilyApple3 is not available
...
ggml-ci
2024-10-06 13:16:18 +03:00
Georgi Gerganov
4b161bc673
metal : fix indent
2024-10-06 13:10:35 +03:00
slaren
5ea66f4354
fixes
2024-10-06 13:09:54 +03:00
Georgi Gerganov
4ef1b017af
llama : adapt to backend changes
...
ggml-ci
2024-10-06 13:09:54 +03:00
Georgi Gerganov
c080e92e75
cont : alternative initialization of global objects
...
ggml-ci
2024-10-06 13:09:54 +03:00
Georgi Gerganov
2e7e05c09b
metal : global registry and device instances
...
ggml-ci
2024-10-06 13:09:54 +03:00
Georgi Gerganov
2d8c2c79ca
metal : fix names [no ci]
2024-10-06 13:09:53 +03:00
Georgi Gerganov
621460063e
ggml : add metal backend registry / device
...
ggml-ci
2024-10-06 13:09:52 +03:00
Georgi Gerganov
b6d6c5289f
sync : llama.cpp
2024-10-06 12:53:28 +03:00
SRHMorris
b0915d5b51
vulkan : retry allocation with fallback flags (whisper/2451)
...
Co-authored-by: Samuel Morris <samuel.morris@artlist.io>
2024-10-06 12:52:11 +03:00
Georgi Gerganov
8c475b97b8
rerank : use [SEP] token instead of [BOS] (#9737)
...
* rerank : use [SEP] token instead of [BOS]
ggml-ci
* common : sanity check for non-NULL tokens
ggml-ci
* ci : adjust rank score interval
ggml-ci
* ci : add shebang to run.sh
ggml-ci
2024-10-05 15:55:04 +03:00
Georgi Gerganov
58b16695e1
sync : ggml
2024-10-05 15:53:49 +03:00
Georgi Gerganov
905f5485b2
metal : zero-init buffer contexts (whisper/0)
2024-10-05 15:53:00 +03:00
Viet-Anh NGUYEN (Andrew)
71967c2a6d
Add Llama Assistant (#9744)
2024-10-04 20:29:35 +02:00
Georgi Gerganov
17880771ad
sync : ggml
2024-10-04 18:50:25 +03:00
Daniel Bevenius
55951c018d
ggml : fix typo in example usage ggml_gallocr_new (ggml/984)
2024-10-04 18:50:05 +03:00
Diego Devesa
ff565769f2
ggml : fixes after sync (ggml/983)
...
ggml : remove test-backend-buffer
ggml : fix CUDA build warnings
2024-10-04 18:50:04 +03:00
Xuan Son Nguyen
f3fdcfaa79
ci : fine-grant permission (#9710)
2024-10-04 11:47:19 +02:00
Daniel Kleine
133c7b46b3
Fixed RNG seed docs (#9723)
...
* Update README.md
fixed RNG seed info
* changed print format to unsigned
2024-10-04 10:54:44 +02:00
Georgi Gerganov
d5ed2b929d
metal : remove abort (skip) (ggml/0)
2024-10-03 21:18:19 +03:00
Georgi Gerganov
1bb8a64ebf
sync : ggml
2024-10-03 21:17:49 +03:00
Johannes Gäßler
fabdc3bda3
ggml/ex: calculate accuracy in graph, adapt MNIST (ggml/980)
2024-10-03 21:17:26 +03:00
Johannes Gäßler
eee39bdc96
ggml: refactor cross entropy loss CPU impl. (ggml/976)
2024-10-03 21:17:26 +03:00
Jack Mousseau
5d5ab1e5cc
metal : fix compute pass descriptor autorelease crash (#9718)
2024-10-03 21:01:46 +03:00
Diego Devesa
a7ad553513
ggml-backend : add device description to CPU backend (#9720)
2024-10-03 17:39:18 +02:00
bandoti
d6fe7abf04
ggml: unify backend logging mechanism (#9709)
...
* Add scaffolding for ggml logging macros
* Metal backend now uses GGML logging
* Cuda backend now uses GGML logging
* Cann backend now uses GGML logging
* Add enum tag to parameters
* Use C memory allocation funcs
* Fix compile error
* Use GGML_LOG instead of GGML_PRINT
* Rename llama_state to llama_logger_state
* Prevent null format string
* Fix whitespace
* Remove log callbacks from ggml backends
* Remove cuda log statement
2024-10-03 17:39:03 +02:00
compilade
e3c355ba65
convert : handle tokenizer merges format from transformers 4.45 (#9696)
2024-10-03 17:22:15 +03:00
Radoslav Gerganov
841713e1e4
rpc : enable vulkan (#9714)
...
closes #8536
2024-10-03 13:00:52 +03:00
Ouadie EL FAROUKI
5639971466
Fixed dequant precision issues in Q4_1 and Q5_1 (#9711)
2024-10-03 07:50:44 +01:00
Diego Devesa
c83ad6d01e
ggml-backend : add device and backend reg interfaces (#9707)
...
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2024-10-03 01:49:47 +02:00
Xuan Son Nguyen
a39ab216aa
llama : reduce compile time and binary size (#9712)
...
* llama : speed up compile time
* fix build
* fix build (2)
2024-10-02 15:49:55 +02:00
Alberto Cabrera Pérez
f536f4c439
[SYCL] Initial cmake support of SYCL for AMD GPUs (#9658)
...
sycl: initial cmake support of SYCL for AMD GPUs
2024-10-02 13:57:18 +01:00
Radoslav Gerganov
00b7317e63
vulkan : do not use tensor->extra (#9407)
...
* vulkan : do not use tensor->extra
This patch allows using the Vulkan backend with the RPC backend as
tensor->extra is no longer used.
Ref: #8536
* Adapt GGML_VULKAN_CHECK_RESULTS to extra removal (#2)
---------
Co-authored-by: 0cc4m <picard12@live.de>
2024-10-02 13:49:16 +03:00
Zhenwei Jin
76b37d1541
gguf-split : improve --split and --merge logic (#9619)
...
* make sure params --split and --merge are not specified at same time
* update gguf-split params parse logic
* Update examples/gguf-split/gguf-split.cpp
Co-authored-by: slaren <slarengh@gmail.com>
---------
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
Co-authored-by: slaren <slarengh@gmail.com>
2024-10-02 10:21:57 +03:00
Georgi Gerganov
148844fe97
examples : remove benchmark (#9704)
...
ggml-ci
2024-10-02 10:14:44 +03:00
Paweł Wodnicki
3f1ae2e32c
Update README.md (#9591)
...
Add Bielik model.
2024-10-01 19:18:46 +02:00
Georgi Gerganov
f1b8c42711
sync : ggml
2024-10-01 16:09:42 +03:00
Johannes Gäßler
e98c1c188e
test: fix OPT_STEP_ADAMW for test-backend-ops (ggml/974)
2024-10-01 16:07:40 +03:00