standby24x7
fa42aa6d89
scripts : fix spelling typo in messages and comments ( #9782 )
...
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
2024-10-08 09:19:53 +03:00
MaggotHATE
98b204c918
Merge branch 'ggerganov:master' into master
2024-10-08 01:20:14 +05:00
MaggotHATE
dbe9ef7783
Added XTC to test-sampling
2024-10-08 01:19:39 +05:00
Diego Devesa
6374743747
ggml : add backend registry / device interfaces to BLAS backend ( #9752 )
...
* ggml : add backend registry / device interfaces to BLAS backend
* fix mmap usage when using host buffers
2024-10-07 21:55:08 +02:00
Andrew Minh Nguyen
f1af42fa8c
Update building for Android ( #9672 )
...
* docs : clarify building Android on Termux
* docs : update building Android on Termux
* docs : add cross-compiling for Android
* cmake : link dl explicitly for Android
2024-10-07 09:37:31 -07:00
Georgi Gerganov
6279dac039
flake.lock: Update ( #9753 )
...
Flake lock file updates:
• Updated input 'flake-parts':
'github:hercules-ci/flake-parts/bcef6817a8b2aa20a5a6dbb19b43e63c5bf8619a?narHash=sha256-HO4zgY0ekfwO5bX0QH/3kJ/h4KvUDFZg8YpkNwIbg1U%3D' (2024-09-12)
→ 'github:hercules-ci/flake-parts/3d04084d54bedc3d6b8b736c70ef449225c361b1?narHash=sha256-K5ZLCyfO/Zj9mPFldf3iwS6oZStJcU4tSpiXTMYaaL0%3D' (2024-10-01)
• Updated input 'flake-parts/nixpkgs-lib':
'https://github.com/NixOS/nixpkgs/archive/356624c12086a18f2ea2825fed34523d60ccc4e3.tar.gz?narHash=sha256-Ss8QWLXdr2JCBPcYChJhz4xJm%2Bh/xjl4G0c0XlP6a74%3D' (2024-09-01)
→ 'https://github.com/NixOS/nixpkgs/archive/fb192fec7cc7a4c26d51779e9bab07ce6fa5597a.tar.gz?narHash=sha256-0xHYkMkeLVQAMa7gvkddbPqpxph%2BhDzdu1XdGPJR%2BOs%3D' (2024-10-01)
• Updated input 'nixpkgs':
'github:NixOS/nixpkgs/1925c603f17fc89f4c8f6bf6f631a802ad85d784?narHash=sha256-J%2BPeFKSDV%2BpHL7ukkfpVzCOO7mBSrrpJ3svwBFABbhI%3D' (2024-09-26)
→ 'github:NixOS/nixpkgs/bc947f541ae55e999ffdb4013441347d83b00feb?narHash=sha256-NOiTvBbRLIOe5F6RbHaAh6%2B%2BBNjsb149fGZd1T4%2BKBg%3D' (2024-10-04)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2024-10-07 09:35:42 -07:00
MaggotHATE
4c44e3da5a
Merge branch 'ggerganov:master' into master
2024-10-07 21:28:09 +05:00
Georgi Gerganov
d5ac8cf2f2
ggml : add metal backend registry / device ( #9713 )
...
* ggml : add metal backend registry / device
ggml-ci
* metal : fix names [no ci]
* metal : global registry and device instances
ggml-ci
* cont : alternative initialization of global objects
ggml-ci
* llama : adapt to backend changes
ggml-ci
* fixes
* metal : fix indent
* metal : fix build when MTLGPUFamilyApple3 is not available
ggml-ci
* fix merge
* metal : avoid unnecessary singleton accesses
ggml-ci
* metal : minor fix [no ci]
* metal : g_state -> g_ggml_ctx_dev_main [no ci]
* metal : avoid reference of device context in the backend context
ggml-ci
* metal : minor [no ci]
* metal : fix maxTransferRate check
* metal : remove transfer rate stuff
---------
Co-authored-by: slaren <slarengh@gmail.com>
2024-10-07 18:27:51 +03:00
Paul Tsochantaris
96b6912103
metal : single allocation of encode_async block ( #9747 )
...
* Single allocation of encode_async block with non-ARC capture in ggml-metal.m
* Moving Block_release to the deallocation code
* Release encode block when re-setting encoding buffer count if needed
* Update ggml/src/ggml-metal.m
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-10-07 15:26:31 +03:00
Georgi Gerganov
d5cb86844f
contrib : simplify + minor edits [no ci]
2024-10-06 14:15:27 +03:00
MaggotHATE
39940e5fa3
Algorithm rework
...
1. Scan tokens from the top until the first non-penalizable one
2. Remove the last captured token (the least probable above threshold)
3. Shift all tokens to override the remaining penalizable ones
4. Penalize them and put them at the bottom.
2024-10-06 16:15:12 +05:00
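The four steps of the "Algorithm rework" commit above can be sketched roughly as follows. This is an illustrative sketch only, not the code from the commit: the `Candidate` struct, the function name `xtc_rework_sketch`, and the choice to penalize by setting the logit to negative infinity are all assumptions made for the example. It also assumes the candidates are already sorted by descending probability, and it leaves out the `min_keep` check and the random chance roll.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical candidate record; the real sampler uses its own token data type.
struct Candidate {
    int   id;
    float p;      // probability after softmax
    float logit;
};

// Sketch of the four steps. Returns the number of penalized tokens.
// Precondition (assumed): cands is sorted by descending probability.
std::size_t xtc_rework_sketch(std::vector<Candidate>& cands, float threshold) {
    assert(std::is_sorted(cands.begin(), cands.end(),
                          [](const Candidate& a, const Candidate& b) { return a.p > b.p; }));

    // Step 1: scan from the top until the first token below the threshold.
    std::size_t n_above = 0;
    while (n_above < cands.size() && cands[n_above].p >= threshold) {
        ++n_above;
    }
    if (n_above < 2) {
        return 0; // nothing to exclude unless at least two tokens pass the threshold
    }

    // Step 2: drop the last captured token (least probable above the threshold)
    // from the set to penalize, so it survives as the new top choice.
    const std::size_t n_pen = n_above - 1;

    // Step 3: shift the surviving tokens forward over the penalizable ones;
    // std::rotate moves the first n_pen elements to the back in the same pass.
    std::rotate(cands.begin(), cands.begin() + n_pen, cands.end());

    // Step 4: penalize the tokens that now sit at the bottom
    // (assumption: a penalty of -infinity on the logit).
    for (std::size_t i = cands.size() - n_pen; i < cands.size(); ++i) {
        cands[i].logit = -std::numeric_limits<float>::infinity();
        cands[i].p     = 0.0f;
    }
    return n_pen;
}
```

With probabilities 0.50, 0.30, 0.15, 0.05 and a threshold of 0.10, the first two tokens are penalized and moved to the back, and the 0.15 token becomes the top choice. Because the penalized tokens stay in the array instead of being erased, a later size check such as `min_keep` can still be applied.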
MaggotHATE
094caea359
Merge branch 'ggerganov:master' into master
2024-10-06 16:06:32 +05:00
Georgi Gerganov
f4b2dcdf49
readme : fix typo [no ci]
2024-10-06 13:49:41 +03:00
Georgi Gerganov
b6d6c5289f
sync : llama.cpp
2024-10-06 12:53:28 +03:00
SRHMorris
b0915d5b51
vulkan : retry allocation with fallback flags (whisper/2451)
...
Co-authored-by: Samuel Morris <samuel.morris@artlist.io>
2024-10-06 12:52:11 +03:00
MaggotHATE
63e60deda3
Swapped sorting for a custom algorithm
...
Shifts tokens to remove the penalized ones, then puts the penalized at the back. Should make `min_keep` still viable.
2024-10-05 23:27:36 +05:00
MaggotHATE
59e8e63e68
Merge branch 'ggerganov:master' into master
2024-10-05 21:51:52 +05:00
Georgi Gerganov
8c475b97b8
rerank : use [SEP] token instead of [BOS] ( #9737 )
...
* rerank : use [SEP] token instead of [BOS]
ggml-ci
* common : sanity check for non-NULL tokens
ggml-ci
* ci : adjust rank score interval
ggml-ci
* ci : add shebang to run.sh
ggml-ci
2024-10-05 15:55:04 +03:00
Georgi Gerganov
58b16695e1
sync : ggml
2024-10-05 15:53:49 +03:00
Georgi Gerganov
905f5485b2
metal : zero-init buffer contexts (whisper/0)
2024-10-05 15:53:00 +03:00
MaggotHATE
74f657cc24
Fixed broken randomization
...
Thanks to @slaren for explanation
2024-10-04 23:47:19 +05:00
MaggotHATE
899e0732ee
Merge branch 'ggerganov:master' into master
2024-10-04 23:46:03 +05:00
MaggotHATE
49cd2118e0
Moved min_keep
...
Moved from conditions to a simple check at the end.
2024-10-04 23:35:47 +05:00
Viet-Anh NGUYEN (Andrew)
71967c2a6d
Add Llama Assistant ( #9744 )
2024-10-04 20:29:35 +02:00
MaggotHATE
6d94ba2e58
Fixed forgotten header
2024-10-04 22:51:04 +05:00
MaggotHATE
4f8e55b170
Fixed RNG to be reproducible
...
Thanks to @slaren for directions
2024-10-04 22:38:12 +05:00
MaggotHATE
f2a2a618a2
Fixed trailing backspaces
2024-10-04 21:42:54 +05:00
MaggotHATE
d9c9203a0b
Merge branch 'ggerganov:master' into master
2024-10-04 21:35:23 +05:00
MaggotHATE
41e16654bd
First fixes by comments
...
Still need to look into sorting
2024-10-04 21:34:31 +05:00
Georgi Gerganov
17880771ad
sync : ggml
2024-10-04 18:50:25 +03:00
Daniel Bevenius
55951c018d
ggml : fix typo in example usage ggml_gallocr_new (ggml/984)
2024-10-04 18:50:05 +03:00
Diego Devesa
ff565769f2
ggml : fixes after sync (ggml/983)
...
ggml : remove test-backend-buffer
ggml : fix CUDA build warnings
2024-10-04 18:50:04 +03:00
MaggotHATE
db54ac5df4
Simplified chances calculation
...
To be more in line with the original implementation, the chance is calculated once at the beginning.
2024-10-04 18:30:46 +05:00
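The idea in the commit above, rolling the chance once at the beginning instead of once per token, can be sketched together with a seeded generator so that runs are repeatable. This is a minimal sketch under stated assumptions: the `XtcGate` type and its member names are hypothetical, and the use of `std::mt19937` with `std::uniform_real_distribution` is an assumption for illustration, not a statement about what the sampler in this branch uses.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper: decides once per sampling step whether XTC is applied.
struct XtcGate {
    float        probability; // chance in [0, 1] that the sampler acts on a step
    std::mt19937 rng;         // seeded generator, so the same seed gives the same rolls

    XtcGate(float prob, std::uint32_t seed) : probability(prob), rng(seed) {
        assert(prob >= 0.0f && prob <= 1.0f);
    }

    // One roll per step, made before any candidate is inspected.
    bool should_apply() {
        std::uniform_real_distribution<float> dist(0.0f, 1.0f);
        return dist(rng) < probability;
    }
};
```

A caller would check `should_apply()` first and return early when it is false, leaving the candidates untouched for that step. Two gates built with the same seed and probability make the same sequence of decisions.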
MaggotHATE
9455194056
Cleanup
2024-10-04 17:53:13 +05:00
MaggotHATE
89640b00a1
Initial XTC commit
...
Adds the XTC sampler. It is not activated by default, but its default parameters are the recommended settings.
2024-10-04 17:51:27 +05:00
Xuan Son Nguyen
f3fdcfaa79
ci : fine-grant permission ( #9710 )
2024-10-04 11:47:19 +02:00
Daniel Kleine
133c7b46b3
Fixed RNG seed docs ( #9723 )
...
* Update README.md
fixed RNG seed info
* changed print format to unsigned
2024-10-04 10:54:44 +02:00
Georgi Gerganov
d5ed2b929d
metal : remove abort (skip) (ggml/0)
2024-10-03 21:18:19 +03:00
Georgi Gerganov
1bb8a64ebf
sync : ggml
2024-10-03 21:17:49 +03:00
Johannes Gäßler
fabdc3bda3
ggml/ex: calculate accuracy in graph, adapt MNIST (ggml/980)
2024-10-03 21:17:26 +03:00
Johannes Gäßler
eee39bdc96
ggml: refactor cross entropy loss CPU impl. (ggml/976)
2024-10-03 21:17:26 +03:00
Jack Mousseau
5d5ab1e5cc
metal : fix compute pass descriptor autorelease crash ( #9718 )
2024-10-03 21:01:46 +03:00
Diego Devesa
a7ad553513
ggml-backend : add device description to CPU backend ( #9720 )
2024-10-03 17:39:18 +02:00
bandoti
d6fe7abf04
ggml: unify backend logging mechanism ( #9709 )
...
* Add scaffolding for ggml logging macros
* Metal backend now uses GGML logging
* Cuda backend now uses GGML logging
* Cann backend now uses GGML logging
* Add enum tag to parameters
* Use C memory allocation funcs
* Fix compile error
* Use GGML_LOG instead of GGML_PRINT
* Rename llama_state to llama_logger_state
* Prevent null format string
* Fix whitespace
* Remove log callbacks from ggml backends
* Remove cuda log statement
2024-10-03 17:39:03 +02:00
compilade
e3c355ba65
convert : handle tokenizer merges format from transformers 4.45 ( #9696 )
2024-10-03 17:22:15 +03:00
Radoslav Gerganov
841713e1e4
rpc : enable vulkan ( #9714 )
...
closes #8536
2024-10-03 13:00:52 +03:00
Ouadie EL FAROUKI
5639971466
Fixed dequant precision issues in Q4_1 and Q5_1 ( #9711 )
2024-10-03 07:50:44 +01:00
Diego Devesa
c83ad6d01e
ggml-backend : add device and backend reg interfaces ( #9707 )
...
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2024-10-03 01:49:47 +02:00
Xuan Son Nguyen
a39ab216aa
llama : reduce compile time and binary size ( #9712 )
...
* llama : speed up compile time
* fix build
* fix build (2)
2024-10-02 15:49:55 +02:00
Alberto Cabrera Pérez
f536f4c439
[SYCL] Initial cmake support of SYCL for AMD GPUs ( #9658 )
...
sycl: initial cmake support of SYCL for AMD GPUs
2024-10-02 13:57:18 +01:00