MaggotHATE
882a603bda
Merge branch 'master' into master
2024-10-11 11:26:05 +05:00
Diego Devesa
7eee341bee
common : use common_ prefix for common library functions (#9805)
...
* common : use common_ prefix for common library functions
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-10-10 22:57:42 +02:00
Diego Devesa
0e9f760eb1
rpc : add backend registry / device interfaces (#9812)
...
* rpc : add backend registry / device interfaces
* llama : add llama_supports_rpc API
* ggml_backend_rpc_start_rpc_server -> ggml_backend_rpc_start_server
2024-10-10 20:14:55 +02:00
R0CKSTAR
cf8e0a3bb9
musa: add docker image support (#9685)
...
* mtgpu: add docker image support
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
* mtgpu: enable docker workflow
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
---------
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2024-10-10 20:10:37 +02:00
MaggotHATE
72db625bd4
Added XTC to server UIs
2024-10-10 22:59:23 +05:00
Diego Devesa
c7499c557c
examples : do not use common library in simple example (#9803)
...
* examples : do not use common library in simple example
* add command line parser, simplify code
2024-10-10 19:50:49 +02:00
MaggotHATE
f7a383ffb3
Initial server support
2024-10-10 21:48:49 +05:00
MaggotHATE
2107882cf5
Renamed parameters, fixed info and defaults
...
* probability is at 0 by default, but XTC is included in sampling queue
* threshold higher than 0.5 switches XTC off
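
The two defaults above can be read as a simple gate. A minimal sketch, assuming the parameters are named `probability` and `threshold`; the helper `xtc_is_active` is hypothetical and not taken from the commit. The reasoning for the second check: probabilities sum to 1, so at most one token can sit above 0.5, and with a single token above the threshold there is nothing more probable to exclude.

```cpp
// Hypothetical helper, not part of llama.cpp: reports whether XTC can change anything.
bool xtc_is_active(float probability, float threshold) {
    if (probability <= 0.0f) {
        return false;  // default: XTC stays in the sampling queue but never fires
    }
    if (threshold > 0.5f) {
        return false;  // at most one token can exceed 0.5, so nothing can be excluded
    }
    return true;
}
```

Under these assumptions, `xtc_is_active(0.0f, 0.1f)` and `xtc_is_active(0.5f, 0.6f)` are both false, while `xtc_is_active(0.5f, 0.1f)` is true.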
2024-10-10 19:35:28 +05:00
MaggotHATE
ba29d31fb7
Merge branch 'ggerganov:master' into master
2024-10-10 11:42:50 +05:00
Diego Devesa
c81f3bbb05
cmake : do not build common library by default when standalone (#9804)
2024-10-09 18:49:52 +02:00
Georgi Gerganov
e7022064ab
perplexity : fix integer overflow (#9783)
...
* perplexity : fix integer overflow
ggml-ci
* perplexity : keep n_vocab as int and make appropriate casts
ggml-ci
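
A minimal sketch of the kind of cast the second bullet describes. The commit message does not say which expression overflowed, so the helper `logits_offset` and its parameters are hypothetical; the point is only that `n_vocab` stays an `int` while each operand is widened before the multiplication.

```cpp
#include <cstdint>

// Hypothetical helper: offset of row `i_token` in a flat logits buffer.
// Both inputs stay int; each is widened before the multiply so the
// product cannot overflow a 32-bit int.
std::int64_t logits_offset(int i_token, int n_vocab) {
    return static_cast<std::int64_t>(i_token) * static_cast<std::int64_t>(n_vocab);
}
```

With 20000 rows and a vocabulary of 128256 entries the product is 2565120000, which is larger than the 2147483647 maximum of a 32-bit `int`.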
2024-10-09 17:00:18 +03:00
MaggotHATE
37e02e34a1
Added XTC to README
2024-10-09 14:08:02 +05:00
MaggotHATE
ed535bb2ae
Merge branch 'ggerganov:master' into master
2024-10-09 14:00:55 +05:00
Georgi Gerganov
3dc48fe75a
examples : remove llama.vim
...
An updated version will be added in #9787
2024-10-09 10:55:42 +03:00
MaggotHATE
d0b1053897
Fixed incorrect min_keep check
2024-10-09 00:59:46 +05:00
MaggotHATE
6feb6b399c
Update dump info in common
2024-10-08 21:15:37 +05:00
MaggotHATE
c19fb26042
Merged back lost commits in common and arg
2024-10-08 21:11:35 +05:00
MaggotHATE
09bc6d507c
Updated info in common and args
2024-10-08 20:57:36 +05:00
MaggotHATE
81a0c2603c
Simplified algorithm and more tests
2024-10-08 18:38:43 +05:00
MaggotHATE
8110f783d1
Merge branch 'ggerganov:master' into master
2024-10-08 18:36:38 +05:00
Diego Devesa
dca1d4b58a
ggml : fix BLAS with unsupported types (#9775)
...
* ggml : do not use BLAS with types without to_float
* ggml : return pointer from ggml_internal_get_type_traits to avoid unnecessary copies
* ggml : rename ggml_internal_get_type_traits -> ggml_get_type_traits
it's not really internal if everybody uses it
2024-10-08 14:21:43 +02:00
Xuan Son Nguyen
458367a906
server : better security control for public deployments (#9776)
...
* server : more explicit endpoint access settings
* protect /props endpoint
* fix tests
* update server docs
* fix typo
* fix tests
2024-10-08 13:27:04 +02:00
standby24x7
fa42aa6d89
scripts : fix spelling typo in messages and comments (#9782)
...
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
2024-10-08 09:19:53 +03:00
MaggotHATE
98b204c918
Merge branch 'ggerganov:master' into master
2024-10-08 01:20:14 +05:00
MaggotHATE
dbe9ef7783
Added XTC to test-sampling
2024-10-08 01:19:39 +05:00
Diego Devesa
6374743747
ggml : add backend registry / device interfaces to BLAS backend (#9752)
...
* ggml : add backend registry / device interfaces to BLAS backend
* fix mmap usage when using host buffers
2024-10-07 21:55:08 +02:00
Andrew Minh Nguyen
f1af42fa8c
Update building for Android (#9672)
...
* docs : clarify building Android on Termux
* docs : update building Android on Termux
* docs : add cross-compiling for Android
* cmake : link dl explicitly for Android
2024-10-07 09:37:31 -07:00
Georgi Gerganov
6279dac039
flake.lock: Update (#9753)
...
Flake lock file updates:
• Updated input 'flake-parts':
'github:hercules-ci/flake-parts/bcef6817a8b2aa20a5a6dbb19b43e63c5bf8619a?narHash=sha256-HO4zgY0ekfwO5bX0QH/3kJ/h4KvUDFZg8YpkNwIbg1U%3D' (2024-09-12)
→ 'github:hercules-ci/flake-parts/3d04084d54bedc3d6b8b736c70ef449225c361b1?narHash=sha256-K5ZLCyfO/Zj9mPFldf3iwS6oZStJcU4tSpiXTMYaaL0%3D' (2024-10-01)
• Updated input 'flake-parts/nixpkgs-lib':
'https://github.com/NixOS/nixpkgs/archive/356624c12086a18f2ea2825fed34523d60ccc4e3.tar.gz?narHash=sha256-Ss8QWLXdr2JCBPcYChJhz4xJm%2Bh/xjl4G0c0XlP6a74%3D' (2024-09-01)
→ 'https://github.com/NixOS/nixpkgs/archive/fb192fec7cc7a4c26d51779e9bab07ce6fa5597a.tar.gz?narHash=sha256-0xHYkMkeLVQAMa7gvkddbPqpxph%2BhDzdu1XdGPJR%2BOs%3D' (2024-10-01)
• Updated input 'nixpkgs':
'github:NixOS/nixpkgs/1925c603f17fc89f4c8f6bf6f631a802ad85d784?narHash=sha256-J%2BPeFKSDV%2BpHL7ukkfpVzCOO7mBSrrpJ3svwBFABbhI%3D' (2024-09-26)
→ 'github:NixOS/nixpkgs/bc947f541ae55e999ffdb4013441347d83b00feb?narHash=sha256-NOiTvBbRLIOe5F6RbHaAh6%2B%2BBNjsb149fGZd1T4%2BKBg%3D' (2024-10-04)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2024-10-07 09:35:42 -07:00
MaggotHATE
4c44e3da5a
Merge branch 'ggerganov:master' into master
2024-10-07 21:28:09 +05:00
Georgi Gerganov
d5ac8cf2f2
ggml : add metal backend registry / device (#9713)
...
* ggml : add metal backend registry / device
ggml-ci
* metal : fix names [no ci]
* metal : global registry and device instances
ggml-ci
* cont : alternative initialization of global objects
ggml-ci
* llama : adapt to backend changes
ggml-ci
* fixes
* metal : fix indent
* metal : fix build when MTLGPUFamilyApple3 is not available
ggml-ci
* fix merge
* metal : avoid unnecessary singleton accesses
ggml-ci
* metal : minor fix [no ci]
* metal : g_state -> g_ggml_ctx_dev_main [no ci]
* metal : avoid reference of device context in the backend context
ggml-ci
* metal : minor [no ci]
* metal : fix maxTransferRate check
* metal : remove transfer rate stuff
---------
Co-authored-by: slaren <slarengh@gmail.com>
2024-10-07 18:27:51 +03:00
Paul Tsochantaris
96b6912103
metal : single allocation of encode_async block (#9747)
...
* Single allocation of encode_async block with non-ARC capture in ggml-metal.m
* Moving Block_release to the deallocation code
* Release encode block when re-setting encoding buffer count if needed
* Update ggml/src/ggml-metal.m
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-10-07 15:26:31 +03:00
Georgi Gerganov
d5cb86844f
contrib : simplify + minor edits [no ci]
2024-10-06 14:15:27 +03:00
MaggotHATE
39940e5fa3
Algorithm rework
...
1. Scan tokens from the top until the first non-penalizable one
2. Drop the last captured token (the least probable one above the threshold) from the captured set
3. Shift all tokens up to overwrite the remaining penalizable ones
4. Penalize them and put them at the bottom.
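
The four steps can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: the `Candidate` struct and the function `xtc_rework_sketch` are hypothetical, the candidates are assumed to be sorted by probability in descending order, and "penalize" is assumed to mean setting the logit to negative infinity and the probability to zero.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Illustrative stand-in for a sampler candidate; not the llama.cpp struct.
struct Candidate {
    int   id;
    float logit;
    float p;  // probability; the vector is assumed sorted by p, descending
};

// Hypothetical sketch of the four steps; returns the number of penalized tokens.
std::size_t xtc_rework_sketch(std::vector<Candidate>& cands, float threshold) {
    // 1. Scan from the top until the first token below the threshold.
    std::size_t n_above = 0;
    while (n_above < cands.size() && cands[n_above].p >= threshold) {
        ++n_above;
    }
    // 2. The last captured token (least probable above the threshold) is spared.
    if (n_above < 2) {
        return 0;  // fewer than two tokens above the threshold: nothing to penalize
    }
    const std::size_t n_pen = n_above - 1;
    // 3. Shift the remaining tokens up; the penalizable ones end up at the back.
    std::rotate(cands.begin(),
                cands.begin() + static_cast<std::ptrdiff_t>(n_pen),
                cands.end());
    // 4. Penalize the tokens that now sit at the bottom.
    for (std::size_t i = cands.size() - n_pen; i < cands.size(); ++i) {
        cands[i].logit = -std::numeric_limits<float>::infinity();  // assumed penalty
        cands[i].p     = 0.0f;
    }
    return n_pen;
}
```

`std::rotate` is used here because it moves the leading block to the end while preserving the relative order of everything else, so the list stays sorted without a full re-sort.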
2024-10-06 16:15:12 +05:00
MaggotHATE
094caea359
Merge branch 'ggerganov:master' into master
2024-10-06 16:06:32 +05:00
Georgi Gerganov
f4b2dcdf49
readme : fix typo [no ci]
2024-10-06 13:49:41 +03:00
Georgi Gerganov
b6d6c5289f
sync : llama.cpp
2024-10-06 12:53:28 +03:00
SRHMorris
b0915d5b51
vulkan : retry allocation with fallback flags (whisper/2451)
...
Co-authored-by: Samuel Morris <samuel.morris@artlist.io>
2024-10-06 12:52:11 +03:00
MaggotHATE
63e60deda3
Swapped sorting for a custom algorithm
...
Shifts tokens to remove the penalized ones, then puts the penalized at the back. Should make `min_keep` still viable.
2024-10-05 23:27:36 +05:00
MaggotHATE
59e8e63e68
Merge branch 'ggerganov:master' into master
2024-10-05 21:51:52 +05:00
Georgi Gerganov
8c475b97b8
rerank : use [SEP] token instead of [BOS] (#9737)
...
* rerank : use [SEP] token instead of [BOS]
ggml-ci
* common : sanity check for non-NULL tokens
ggml-ci
* ci : adjust rank score interval
ggml-ci
* ci : add shebang to run.sh
ggml-ci
2024-10-05 15:55:04 +03:00
Georgi Gerganov
58b16695e1
sync : ggml
2024-10-05 15:53:49 +03:00
Georgi Gerganov
905f5485b2
metal : zero-init buffer contexts (whisper/0)
2024-10-05 15:53:00 +03:00
MaggotHATE
74f657cc24
Fixed broken randomization
...
Thanks to @slaren for explanation
2024-10-04 23:47:19 +05:00
MaggotHATE
899e0732ee
Merge branch 'ggerganov:master' into master
2024-10-04 23:46:03 +05:00
MaggotHATE
49cd2118e0
Moved min_keep
...
Moved from conditions to a simple check at the end.
2024-10-04 23:35:47 +05:00
Viet-Anh NGUYEN (Andrew)
71967c2a6d
Add Llama Assistant (#9744)
2024-10-04 20:29:35 +02:00
MaggotHATE
6d94ba2e58
Fixed forgotten header
2024-10-04 22:51:04 +05:00
MaggotHATE
4f8e55b170
Fixed RNG to be reproducible
...
Thanks to @slaren for directions
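
The commit message does not say how reproducibility was achieved. One common approach, sketched here purely as an assumption, is to seed a pseudo-random engine once and reuse it for every draw instead of pulling fresh entropy each time. The type `XtcRngSketch` and its `roll` method are hypothetical names.

```cpp
#include <cstdint>
#include <random>

// Hypothetical sampler state: the engine is seeded once and then reused,
// so a given seed always yields the same sequence of draws.
struct XtcRngSketch {
    std::mt19937 rng;

    explicit XtcRngSketch(std::uint32_t seed) : rng(seed) {}

    // Returns true when the draw falls below `probability`.
    bool roll(float probability) {
        std::uniform_real_distribution<float> dist(0.0f, 1.0f);
        return dist(rng) < probability;
    }
};
```

Two instances built with the same seed make the same sequence of decisions on a given standard library, which is what makes a run repeatable.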
2024-10-04 22:38:12 +05:00
MaggotHATE
f2a2a618a2
Fixed trailing whitespace
2024-10-04 21:42:54 +05:00
MaggotHATE
d9c9203a0b
Merge branch 'ggerganov:master' into master
2024-10-04 21:35:23 +05:00