Commit graph

766 commits

Author SHA1 Message Date
niansa
77ebe46966 Fixed case order in ggml_vk_graph_compute 2023-07-05 14:21:16 +02:00
niansa
856b7589e9 Optimized ggml_vk_mul_mat_f16 argument count 2023-07-05 13:34:01 +02:00
niansa
6be93e6071 Ported mat mul from Metal 2023-07-05 13:28:40 +02:00
niansa
2fc8249ba3 Simple mul_mat_f16 for speed and removal of unused mul_mat_f32 2023-07-05 10:59:38 +02:00
niansa
f0e1429d7f Implemented RMS_NORM 2023-06-30 16:01:08 +02:00
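The RMS_NORM op implemented above normalizes each row by its root-mean-square. A minimal CPU reference sketch of the math (the commit's actual version runs as a Vulkan shader; the function name and the `eps` default here are illustrative assumptions):

```cpp
#include <cmath>
#include <cstddef>

// Reference RMS normalization for one row: y[i] = x[i] / sqrt(mean(x^2) + eps).
// ggml applies this per row; the eps value is an assumed small constant.
void rms_norm_row(const float *x, float *y, size_t n, float eps = 1e-6f) {
    float sum_sq = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        sum_sq += x[i] * x[i];
    }
    const float scale = 1.0f / std::sqrt(sum_sq / n + eps);
    for (size_t i = 0; i < n; ++i) {
        y[i] = x[i] * scale;
    }
}
```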
niansa
d1f84db4b6 Implemented GGML_OP_NORM 2023-06-30 15:18:10 +02:00
niansa
8fa60134b1 Added missing break to mul_mat_f16 case 2023-06-30 12:47:17 +02:00
niansa
0dc5f2f2ba Fixed mul mat dispatch size 2023-06-30 12:31:13 +02:00
niansa
f093bf2e5e Minor MUL_MAT fix and implemented DIAG_MASK_INF 2023-06-30 12:19:29 +02:00
niansa
964fe8c546 Added mul_mat (needs fixes) 2023-06-30 11:47:10 +02:00
niansa
749d6179a8 Snake case all functions 2023-06-29 14:23:00 +02:00
niansa
5ac68ccacb Cleanups 2023-06-29 11:14:21 +02:00
niansa
de7d1823ed Implemented ggml_vk_soft_max 2023-06-28 12:48:41 +02:00
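The soft_max op implemented above is the standard numerically stable softmax; a CPU reference sketch of what the shader computes (function name is illustrative, not the actual `ggml_vk_soft_max` signature):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Numerically stable softmax over one row: subtract the row max before exp
// so large logits do not overflow, then normalize to sum to 1.
void soft_max_row(const float *x, float *y, size_t n) {
    const float max_val = *std::max_element(x, x + n);
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        y[i] = std::exp(x[i] - max_val);
        sum += y[i];
    }
    for (size_t i = 0; i < n; ++i) {
        y[i] /= sum;
    }
}
```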
niansa
e2b721db65 Allow vk add row 2023-06-28 10:19:18 +02:00
niansa
ed14f0764a Fixed ggml_vk_abmath row argument 2023-06-28 10:15:23 +02:00
niansa
072007b1e8 Add buffer qualifiers 2023-06-23 21:21:16 +02:00
niansa
acb7d90398 Reenabled unknown op message 2023-06-23 20:39:32 +02:00
niansa
5d5f66d1d9 More little fixes and stuff 2023-06-23 20:37:58 +02:00
niansa
e0814f86a2 Free vk context 2023-06-23 20:02:46 +02:00
niansa
55815b67f4 Improved memory safety 2023-06-23 19:58:41 +02:00
niansa
4b267e88b6 Temporarily care for all layers 2023-06-23 18:40:58 +02:00
niansa
40621ea0ec Added more debugging 2023-06-23 18:26:21 +02:00
niansa
e6da9bd96b Added ggml_vk_mem_used() 2023-06-23 17:57:09 +02:00
niansa
1a68195408 Add mutexes for gpu tensors 2023-06-23 17:46:09 +02:00
niansa
46f577bfc1 h2d tensors during loadup 2023-06-23 17:10:45 +02:00
niansa
98e588c6eb Fix ggml_vk_h2d_tensor throwing on second call 2023-06-23 16:50:37 +02:00
niansa
09b0b3a49b Wait for all threads to finish 2023-06-23 16:13:32 +02:00
niansa
2589cb0c70 Prevent compileSource race 2023-06-23 16:02:49 +02:00
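The race fixed above presumably arises when several threads compile the same shader source concurrently. One hedged sketch of serializing compilation behind a mutex with a result cache (the function, cache, and placeholder SPIR-V output are all illustrative, not the actual kompute `compileSource` API):

```cpp
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical compiled-shader cache: serialize compilation so two threads
// cannot compile (and insert) the same source at once.
static std::mutex g_compile_mutex;
static std::map<std::string, std::vector<uint32_t>> g_spirv_cache;

std::vector<uint32_t> compile_source_locked(const std::string &src) {
    std::lock_guard<std::mutex> lock(g_compile_mutex);
    auto it = g_spirv_cache.find(src);
    if (it != g_spirv_cache.end()) {
        return it->second;  // already compiled by another thread
    }
    // Placeholder for the real compiler call; 0x07230203 is the SPIR-V magic.
    std::vector<uint32_t> spirv = {0x07230203u};
    g_spirv_cache.emplace(src, spirv);
    return spirv;
}
```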
niansa
5c0d8dd0f2 Specify program output size 2023-06-23 15:58:13 +02:00
niansa
e830264c92 Share sequence to functions and add scale() 2023-06-23 15:10:24 +02:00
niansa
5e9403342b Minor fixes 2023-06-23 15:01:09 +02:00
niansa
b6264542b7 Added vk_mul to ggml_vk_graph_compute 2023-06-23 14:19:31 +02:00
niansa
18d6f7f8da More progress... 2023-06-23 14:08:45 +02:00
niansa
d539247996 Began implementing ggml_graph_compute 2023-06-23 14:03:33 +02:00
niansa
b8a4594f89 More fixes... 2023-06-23 12:19:33 +02:00
niansa
9d643755a6 Fixed compile error 2023-06-23 11:51:25 +02:00
niansa
339bc36cdd Added more functions from Metal 2023-06-23 11:50:30 +02:00
niansa
9cdaea9240 Implemented dequantize_row_q4_1 2023-06-22 16:30:36 +02:00
niansa
b0f11fa9c1 More code cleanups 2023-06-22 16:05:56 +02:00
niansa
3b3d30e4ad Cleanups 2023-06-22 13:57:04 +02:00
niansa
2f3fe0c0a4 Updated gitignore 2023-06-22 13:57:04 +02:00
niansa
4f598dd973 Initial working stuff 2023-06-22 13:57:04 +02:00
Johannes Gäßler
bbca06e269
cmake: revert CUDA arch default to 52, 61 if f16 (#1959) 2023-06-21 23:49:25 +02:00
Rahul Vivek Nair
fb98254f99
Fix typo in README.md (#1961) 2023-06-21 23:48:43 +02:00
Georgi Gerganov
049aa16b8c
readme : add link to p1 2023-06-20 19:05:54 +03:00
Xiake Sun
2322ec223a
Fix typo (#1949) 2023-06-20 15:42:40 +03:00
Ettore Di Giacinto
aacdbd4056
llama : fix params struct alignment (#1936)
* Workaround struct misalignment during value-copy

* Move booleans at the bottom of the structure

* Add comment

Signed-off-by: mudler <mudler@localai.io>
2023-06-20 04:24:39 +03:00
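The alignment fix above moves booleans to the bottom of the params struct: scattering 1-byte members between wider ones inserts interior padding, while grouping wide members first and booleans last keeps the layout tight. A generic sketch of the technique (field names invented for illustration, not the actual `llama_context_params` fields):

```cpp
#include <cstdint>

// Scattered layout: each bool forces padding before the next wider member.
struct params_scattered {
    bool    use_mmap;   // 1 byte + 3 bytes padding on typical ABIs
    int32_t n_ctx;
    bool    use_mlock;  // 1 byte + 7 bytes padding before the int64_t
    int64_t seed;
};

// Reordered layout: wide members first, booleans grouped at the bottom.
struct params_packed {
    int64_t seed;
    int32_t n_ctx;
    bool    use_mmap;
    bool    use_mlock;
};

static_assert(sizeof(params_packed) <= sizeof(params_scattered),
              "grouping booleans at the end never grows the struct");
```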
Henri Vasserman
20568fe60f
[Fix] Reenable server embedding endpoint (#1937)
* Add back embedding feature

* Update README
2023-06-20 01:12:39 +03:00
Georgi Gerganov
18b35625c3
ggml : fix bug in LBFGS optimizer (found by ggml tests) 2023-06-19 20:43:30 +03:00
l3utterfly
ba4e85a833
llama : use aligned memory during ggml_init call from loading saved sessions (#1934)
* fixed issue: memory is not guaranteed to be aligned properly during ggml_init call from loading saved sessions

* removed commented out old code from fix
* updated another instance of same issue below original
2023-06-19 18:20:06 +03:00