Zhiyuan Li
6a1e977e34
Update ggml/src/ggml-sycl/concat.cpp
...
Co-authored-by: Meng, Hengyu <airdldl@163.com>
2024-11-05 02:41:55 +11:00
Zhiyuan Li
35a1a2dfa9
move element-wise functions outside
2024-11-05 02:40:11 +11:00
Zhiyuan Li
72e4432577
add appropriate asserts
2024-11-05 01:20:52 +11:00
Zhiyuan Li
b81602477b
Update ggml/src/ggml-cpu.c
...
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-11-05 01:14:27 +11:00
Zhiyuan Li
a878502f43
fix define error
2024-11-05 01:07:33 +11:00
Zhiyuan Li
81cb301224
update the function to use appropriate types
2024-11-05 00:55:59 +11:00
Zhiyuan Li
bb0685fad5
Update ggml/src/ggml-sycl/wkv6.cpp
...
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-11-05 00:42:37 +11:00
Zhiyuan Li
8c7b4ec22a
Update ggml/src/ggml-sycl/outprod.cpp
...
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-11-05 00:42:31 +11:00
Zhiyuan Li
9ea34a78cb
fix: add default
2024-11-04 23:28:26 +11:00
Zhiyuan Li
61c665b7f1
fix: update changes to upstream
2024-11-04 22:17:12 +11:00
Zhiyuan Li
5f792141c5
Merge branch 'ggerganov:master' into master
2024-11-04 22:12:31 +11:00
Georgi Gerganov
153251f761
sync : ggml
2024-11-04 22:10:53 +11:00
Yuri Khrustalev
eb5711c496
cmake : make it possible linking ggml as external lib (ggml/1003)
2024-11-04 22:10:53 +11:00
Plamen Minev
8050d021ab
metal : fix minor string leaks (ggml/1004)
2024-11-04 22:10:53 +11:00
Diego Devesa
89812b157a
ggml : move CPU backend to a separate file ( #10144 )
2024-11-04 22:10:53 +11:00
Georgi Gerganov
b18963085b
metal : minor fixup in FA kernel ( #10143 )
...
* metal : minor fixup in FA kernel
ggml-ci
* metal : use the unrolled loop variable
* metal : remove unused var
2024-11-04 22:09:57 +11:00
Georgi Gerganov
4d266310f5
flake.lock: Update ( #10146 )
2024-11-04 22:09:57 +11:00
leo-pony
329ed914c9
CANN: adjust backend registry refactor. ( #10158 )
...
Remove buffer->iface.get_name, which was used in CANN, as it was removed in the backend registry refactor PR.
2024-11-04 19:08:22 +08:00
Georgi Gerganov
ce027adfb3
sync : ggml
2024-11-04 10:33:37 +02:00
Yuri Khrustalev
284e5b0275
cmake : make it possible linking ggml as external lib (ggml/1003)
2024-11-04 10:33:11 +02:00
Plamen Minev
e2292aaa17
metal : fix minor string leaks (ggml/1004)
2024-11-04 10:33:10 +02:00
Diego Devesa
9f40989351
ggml : move CPU backend to a separate file ( #10144 )
2024-11-03 19:34:08 +01:00
Georgi Gerganov
08828a6d7d
metal : minor fixup in FA kernel ( #10143 )
...
* metal : minor fixup in FA kernel
ggml-ci
* metal : use the unrolled loop variable
* metal : remove unused var
2024-11-03 15:18:40 +02:00
Georgi Gerganov
1839f69130
flake.lock: Update ( #10146 )
2024-11-03 05:14:15 -08:00
Zhiyuan Li
811aa872d6
wkv6: drop armv9 and transfer to GGML style
2024-11-03 23:54:57 +11:00
Zhiyuan Li
042c3e0fd3
Merge branch 'ggerganov:master' into master
2024-11-03 17:30:25 +11:00
Zhiyuan Li
1c58096f6f
sycl: Enhance OP support judgment
2024-11-03 16:43:17 +11:00
Zhiyuan Li
bee1cec7d2
sycl: add some ops
2024-11-03 16:43:17 +11:00
Zhiyuan Li
2fc42b6a82
wkv on sycl
2024-11-03 16:43:17 +11:00
Zhiyuan Li
3f75f12114
rwkv6: rename params
2024-11-03 16:43:17 +11:00
Zhiyuan Li
e198f7b9df
rwkv6: update cuda file name
2024-11-03 16:43:17 +11:00
Zhiyuan Li
b4254c5550
rwkv6: support avx2 avx512 armv8 armv9
2024-11-03 16:43:17 +11:00
Zhiyuan Li
f66c75a495
rwkv6: rename to wkv6
2024-11-03 16:43:17 +11:00
Christian Köhnenkamp
9830b6923b
Add apple arm to presets ( #10134 )
...
* Add apple arm to presets
* Add final new line
2024-11-02 15:35:31 -07:00
sasha0552
42cadc74bd
server : fix slot selection by lru ( #10126 )
...
* server : fix slot selection by lru, migrate lcs to `size_t`
* minor debug log fix
2024-11-02 18:34:56 +02:00
Georgi Gerganov
45950415ed
server : fix endpoint checks ( #10135 )
...
ggml-ci
2024-11-02 18:34:00 +02:00
Georgi Gerganov
1926d6e39d
llama : adjust default context size + print warnings ( #10136 )
...
* llama : adjust default context size + print warnings
ggml-ci
* ggml-ci : add missing gpu-layers + adjust context sizes
2024-11-02 15:18:56 +02:00
Diego Devesa
b634f8a26f
simple-chat : only add bos on first prompt ( #10129 )
2024-11-02 13:08:53 +01:00
Xuan Son Nguyen
7554aa4655
convert-lora : make --base optional ( #10110 )
...
* convert-lora : make `--base` optional
* lint
* handle case where base_model_name_or_path is invalid
* do not include metadata from base model
* clarify unspecified --base
* add small comment [no ci]
* trigger ci
2024-11-02 12:53:17 +01:00
Diego Devesa
a6744e43e8
llama : add simple-chat example ( #10124 )
...
* llama : add simple-chat example
---------
Co-authored-by: Xuan Son Nguyen <thichthat@gmail.com>
2024-11-01 23:50:59 +01:00
Diego Devesa
e991e3127f
llama : use smart pointers for ggml resources ( #10117 )
2024-11-01 23:48:26 +01:00
Shupei Fan
418f5eef26
vulkan : improve ggml_vk_create_buffer error handling ( #9898 )
2024-11-01 19:33:14 +01:00
Georgi Gerganov
ba6f62eb79
readme : update hot topics
2024-11-01 17:31:51 +02:00
sasha0552
d865d1478c
server : fix smart selection of available slot ( #10120 )
...
* Fix smart selection of available slot
* minor fix
* replace vectors of tokens with shorthands
2024-11-01 14:33:14 +01:00
Georgi Gerganov
1804adb0cf
ggml : remove ggml_scratch ( #10121 )
...
ggml-ci
2024-11-01 12:58:45 +02:00
Georgi Gerganov
815fe72adc
sync : ggml
2024-11-01 10:28:24 +02:00
Georgi Gerganov
f221d56220
ggml : alloc ggml_contexts on the heap (whisper/2525)
2024-11-01 10:24:50 +02:00
Zhenwei Jin
e597e50794
build: fix build error in Windows env with OneAPI setup ( #10107 )
2024-11-01 11:09:59 +08:00
Diego Devesa
85679d37f3
llama : improve output buffer type selection ( #10098 )
2024-11-01 00:49:53 +01:00
Diego Devesa
1e9f94994e
quantize : fix --keep-split ( #10114 )
2024-11-01 00:45:34 +01:00