Georgi Gerganov
|
b252acbcb6
|
metal : add comments
|
2023-06-04 18:10:28 +03:00 |
|
Georgi Gerganov
|
d8a7486d17
|
Revert "ci : disable temporary"
This reverts commit 98c267fc77.
|
2023-06-04 17:58:23 +03:00 |
|
Georgi Gerganov
|
a7fb899c53
|
metal : final refactoring and simplification
|
2023-06-04 17:57:02 +03:00 |
|
Georgi Gerganov
|
e26cd6b483
|
mtl : remove temp / debug code
|
2023-06-04 11:23:36 +03:00 |
|
Georgi Gerganov
|
e4b522232c
|
mtl : clean-up ggml mtl interface + support scratch / inplace
|
2023-06-04 11:13:40 +03:00 |
|
Georgi Gerganov
|
18e482a89c
|
mtl : preparing for merge
|
2023-06-04 09:27:27 +03:00 |
|
Georgi Gerganov
|
4df2ef3161
|
mtl : make it work with main example
Lots of hacks but at least now it generates text
|
2023-06-03 09:31:33 +03:00 |
|
Georgi Gerganov
|
2f4e9d19cc
|
mtl : plug Metal inference into llama.cpp (very quick-n-dirty)
|
2023-06-02 22:45:34 +03:00 |
|
Georgi Gerganov
|
640a889632
|
mtl : add save/load vocab to ggml file
|
2023-06-02 21:00:30 +03:00 |
|
Georgi Gerganov
|
03c2d72867
|
mtl : simplify implementation
|
2023-06-02 20:36:26 +03:00 |
|
Georgi Gerganov
|
627605732c
|
mtl : remove printfs from inner loop
|
2023-06-02 19:58:08 +03:00 |
|
Georgi Gerganov
|
b088e14a7e
|
mtl : more threads for rms_norm + better timing
|
2023-06-02 19:26:58 +03:00 |
|
Georgi Gerganov
|
70c3387726
|
mtl : fix kernel signature + roll inner loop
|
2023-06-02 19:11:39 +03:00 |
|
Georgi Gerganov
|
847bbfe9e6
|
mtl : faster mul_mat_q4_0_f32 kernel
|
2023-06-02 18:40:25 +03:00 |
|
Georgi Gerganov
|
33671460b0
|
mtl : fix bug in f16 x f32 mul mat + speed-up computation
|
2023-06-02 18:23:51 +03:00 |
|
Georgi Gerganov
|
e55f7b0bdb
|
mtl : add f16 mat x f32 vec multiplication kernel
|
2023-06-01 23:37:49 +03:00 |
|
Georgi Gerganov
|
f0196a7e7a
|
mtl : optimize rms_norm and soft_max kernels
|
2023-06-01 22:51:42 +03:00 |
|
Georgi Gerganov
|
9665429e94
|
mtl : full GPU inference of the computation graph
|
2023-06-01 21:50:01 +03:00 |
|
Georgi Gerganov
|
fbd3f6258d
|
mtl : add non-broadcast mul kernel
|
2023-06-01 21:40:53 +03:00 |
|
Georgi Gerganov
|
42dca4004c
|
mtl : add silu kernel
|
2023-06-01 21:35:11 +03:00 |
|
Georgi Gerganov
|
a0cc3de59a
|
mtl : add f32 -> f32 cpy kernel
|
2023-06-01 21:30:33 +03:00 |
|
Georgi Gerganov
|
a266c26de2
|
mtl : verify V tensor contents
|
2023-06-01 21:27:24 +03:00 |
|
Georgi Gerganov
|
f67c2d8cab
|
ggml : update ggml_nbytes() to handle non-contiguous tensors
|
2023-06-01 21:27:03 +03:00 |
|
Georgi Gerganov
|
17930fbcb7
|
mtl : fix soft_max kernel
|
2023-06-01 20:48:24 +03:00 |
|
Georgi Gerganov
|
17a70362a6
|
mtl : add diag_mask_inf kernel
|
2023-06-01 20:41:54 +03:00 |
|
Georgi Gerganov
|
0f1c580860
|
mtl : add scale kernel
|
2023-06-01 19:52:32 +03:00 |
|
Georgi Gerganov
|
51efb59437
|
mtl : confirm f16 x f32 attention mul mat
|
2023-06-01 19:45:36 +03:00 |
|
Georgi Gerganov
|
948fcfde7e
|
mtl : add cpy kernel + handle view ops
|
2023-06-01 19:21:28 +03:00 |
|
Georgi Gerganov
|
94ea9e7bfe
|
ggml : store offset as opt arg for ggml_view_xd() operators
|
2023-06-01 19:21:08 +03:00 |
|
Georgi Gerganov
|
7ca81e9e65
|
mtl : add reshape and transpose handling
|
2023-05-31 23:01:37 +03:00 |
|
Georgi Gerganov
|
1213af76ce
|
mtl : add rope kernel
|
2023-05-31 22:28:59 +03:00 |
|
Georgi Gerganov
|
6af6a05663
|
ggml : fix handling of "view" ops in ggml_graph_import()
|
2023-05-31 22:28:15 +03:00 |
|
Georgi Gerganov
|
b2fd06c6aa
|
mtl : working mul_mat q4
|
2023-05-30 23:06:49 +03:00 |
|
Georgi Gerganov
|
29bec00ba0
|
mtl : another mul_mat Q4 (still does not work)
|
2023-05-30 22:31:07 +03:00 |
|
Georgi Gerganov
|
96d005225f
|
mtl : mul_mat fixes (still wrong)
|
2023-05-30 22:20:17 +03:00 |
|
Georgi Gerganov
|
2a24994bad
|
mtl : initial mul_mat Q4 kernel (wrong results)
|
2023-05-30 22:02:54 +03:00 |
|
Georgi Gerganov
|
64afc0b53a
|
mtl : add mul kernel + confirm working
|
2023-05-30 19:15:38 +03:00 |
|
Georgi Gerganov
|
72256ebd2b
|
mtl : add rms_norm kernel + confirm working
|
2023-05-30 19:03:04 +03:00 |
|
Georgi Gerganov
|
794704e409
|
mtl : confirmed get_rows_q4_0 is working correctly
|
2023-05-30 18:41:45 +03:00 |
|
Georgi Gerganov
|
a8fd9dc128
|
mtl : initial get_rows_q4_0 kernel
|
2023-05-29 23:12:19 +03:00 |
|
Georgi Gerganov
|
248a8c3379
|
mtl : move MSL code into separate file for easy editing
|
2023-05-29 22:26:40 +03:00 |
|
Georgi Gerganov
|
897d6d8e8f
|
mtl : export just a small part of the graph for now to make it easier
|
2023-05-29 21:40:05 +03:00 |
|
Georgi Gerganov
|
a792cbd0fc
|
mtl : no need for mtl-export tool, add cli arg for main instead
|
2023-05-29 21:28:59 +03:00 |
|
Georgi Gerganov
|
b23fe8c9c7
|
mtl : adapt the MNIST example as starter
|
2023-05-29 21:20:56 +03:00 |
|
Georgi Gerganov
|
98c267fc77
|
ci : disable temporary
|
2023-05-29 20:57:24 +03:00 |
|
Georgi Gerganov
|
f85020b19a
|
mtl : export the LLaMA computation graph
|
2023-05-29 20:49:24 +03:00 |
|
Georgi Gerganov
|
7552ac5863
|
ggml : sync cgraph import / export API
|
2023-05-29 19:31:44 +03:00 |
|
Georgi Gerganov
|
5d1830b99d
|
ggml : fix bug in ggml_alibi
|
2023-05-29 19:30:49 +03:00 |
|
DannyDaemonic
|
248367605e
|
Work around for recalculating logits in cached prompts (Fixes #1585) (#1609)
* Work around for recalculating logits in cached prompts
|
2023-05-29 05:13:40 -07:00 |
|
Jiří Podivín
|
0e730dd23b
|
Adding git in container package dependencies (#1621)
Git added to build packages for version information in docker image
Signed-off-by: Jiri Podivin <jpodivin@gmail.com>
|
2023-05-28 21:45:50 -07:00 |
|