Commit graph

3346 commits

Author SHA1 Message Date
hongruichen
ff601abc1c add todo 2024-07-16 00:05:40 +08:00
hongruichen
f32327e2b2 remove multiple declarations of log in unit test 2024-07-15 12:06:12 +08:00
hongruichen
cd5a7331f7 add cpu backend as cross reference 2024-07-15 10:55:17 +08:00
hongruichen
4410fd6563 format with clang-format 2024-07-15 10:30:57 +08:00
hongruichen
c46b4deea9 [unit test] init all tensor by one function 2024-07-15 10:23:19 +08:00
hongruichen
30b40006cc remove unused declarations 2024-07-14 23:50:11 +08:00
hongruichen
148ceab70c add log op 2024-07-14 23:00:50 +08:00
hongruichen
c1e2283887 expose op at unit test 2024-07-13 11:07:06 +08:00
hongruichen
100ccd5e7f add unary op template and more ops 2024-07-13 00:55:34 +08:00
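The "unary op template" approach mentioned in this commit can be sketched as below. This is a minimal illustration, not the backend's actual code: the names `qnn_unary_op`, `op_sqrt`, and `op_log` are hypothetical stand-ins for whatever the real implementation uses.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch: one template drives every elementwise unary op,
// so adding a new op (sqrt, log, ...) is just another instantiation.
template <float (*Fn)(float)>
void qnn_unary_op(const std::vector<float> &src, std::vector<float> &dst) {
    dst.resize(src.size());
    for (size_t i = 0; i < src.size(); ++i) {
        dst[i] = Fn(src[i]);
    }
}

// Thin wrappers pin down the overload set of the <cmath> functions
// so they can be used as template arguments.
inline float op_sqrt(float x) { return std::sqrt(x); }
inline float op_log(float x)  { return std::log(x); }
```

With this shape, `qnn_unary_op<op_sqrt>` and `qnn_unary_op<op_log>` are distinct, fully inlined instantiations, which matches the commit's goal of adding "more ops" without duplicating loop code.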
hongruichen
7cbc4fbd8c add mul 2024-07-12 23:26:38 +08:00
hongruichen
e3aa43adbd suppress warning 2024-07-12 23:26:11 +08:00
hongruichen
0eb595cc6e use table to simplify the op mapping 2024-07-12 23:22:29 +08:00
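A table-driven op mapping of the kind this commit describes typically replaces a per-op switch statement with an array lookup. The sketch below is an assumption-laden illustration: the `qnn_op` enum and the QNN op-name strings are made up here to mimic the idea, not copied from the backend.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical op enum standing in for ggml's op codes.
enum qnn_op { OP_ADD = 0, OP_MUL, OP_SQRT, OP_LOG, OP_COUNT };

// Table indexed by op code: one lookup replaces a growing switch when
// translating ops to backend op names. The strings are illustrative only.
static const char *k_op_name[OP_COUNT] = {
    "ElementWiseAdd",        // OP_ADD
    "ElementWiseMultiply",   // OP_MUL
    "ElementWiseSquareRoot", // OP_SQRT
    "ElementWiseLog",        // OP_LOG
};

const char *op_to_qnn_name(qnn_op op) {
    return (op >= 0 && op < OP_COUNT) ? k_op_name[op] : nullptr;
}
```

The payoff is that adding an op becomes a one-line table entry instead of another switch case, and an out-of-range op is caught in a single place.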
hongruichen
f0894d897a wip 2024-07-12 19:57:34 +08:00
hongruichen
be3aa9631f use template function directly 2024-07-11 11:18:06 +08:00
hongruichen
8932135fdb add sqrt and mul ops 2024-07-11 00:08:08 +08:00
hongruichen
7ea28a6fac add helper function for binary op 2024-07-10 23:39:03 +08:00
hongruichen
b6f29273f0 add function to get graph from cache 2024-07-10 23:08:32 +08:00
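"Get graph from cache" usually means a get-or-create lookup: build the graph on first use, reuse it on subsequent calls. A minimal sketch, assuming a string key and a simplified `qnn_graph` stand-in (the real `ggml_qnn_graph` wraps QNN handles and is considerably more involved):

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Simplified stand-in for the backend's graph object.
struct qnn_graph {
    explicit qnn_graph(std::string key) : key(std::move(key)) {}
    std::string key;
};

// Get-or-create: construct the graph on a cache miss, then always
// return the cached instance so repeated ops reuse one finalized graph.
qnn_graph *get_graph_from_cache(
        std::map<std::string, std::unique_ptr<qnn_graph>> &cache,
        const std::string &key) {
    auto it = cache.find(key);
    if (it == cache.end()) {
        it = cache.emplace(key, std::make_unique<qnn_graph>(key)).first;
    }
    return it->second.get();
}
```

Caching by key avoids rebuilding and re-finalizing a QNN graph for every invocation of the same op shape, which is typically the expensive part.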
hongruichen
80051cfc4d remove unused variables 2024-07-10 19:57:47 +08:00
hongruichen
b49b501e26 fix sprintf type 2024-07-10 19:48:57 +08:00
hongruichen
3feb574bf0 merge register_rpc_mem into alloc_rpc_mem 2024-07-10 19:40:02 +08:00
hongruichen
e97d3a6c48 fix tensor buffer allocation
add log

commit qnn buffer after it is changed

add log

register_rpc_mem 2 times

update input tensors before graph finalize

default to QNN_TENSORMEMTYPE_RAW

set new tensors at execute

move write input tensors to exec

check if mem is registered before actually registering

register rpc mem once allocated
2024-07-10 19:32:39 +08:00
hongruichen
dc7d83e121 add log 2024-07-10 00:33:23 +08:00
hongruichen
9add256efe use helper function instead 2024-07-10 00:31:39 +08:00
hongruichen
a7be0693ba add log 2024-07-10 00:29:43 +08:00
hongruichen
af869fd636 fix compiling error in debug build 2024-07-10 00:23:51 +08:00
Hongrui Chen
5f2e3918f6 refactoring ggml_qnn_tensor 2024-07-09 19:58:46 +08:00
Hongrui Chen
874216b9c8 remove unused members 2024-07-07 22:32:43 +08:00
hongruichen
263ffa962e small opt of the qnn graph config init 2024-07-05 23:07:27 +08:00
hongruichen
4b0f6b0cd6 add helper function to get Qnn_TensorType_t from ggml_tensor 2024-07-05 19:37:58 +08:00
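A helper like the one this commit names usually classifies a tensor by its role in the graph. The sketch below mirrors the general QNN convention (app-writable inputs, app-readable outputs, native intermediates), but the `tensor_type` enum here is a hypothetical stand-in, not the SDK's `Qnn_TensorType_t`:

```cpp
#include <cassert>

// Hypothetical mirror of a QNN-style tensor-type enum: graph inputs are
// written by the application, outputs are read back by it, and
// intermediate tensors stay native to the backend.
enum tensor_type { TENSOR_APP_WRITE, TENSOR_APP_READ, TENSOR_NATIVE };

// Sketch of the helper: derive the tensor type from the tensor's role,
// which in the real code would be inferred from the ggml_tensor itself.
tensor_type tensor_type_from_role(bool is_input, bool is_output) {
    if (is_input)  return TENSOR_APP_WRITE;
    if (is_output) return TENSOR_APP_READ;
    return TENSOR_NATIVE;
}
```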
hongruichen
0f2e68713c move tensor related function to utils 2024-07-05 19:02:38 +08:00
hongruichen
58cec14092 reformat 2024-07-05 17:38:54 +08:00
hongruichen
13dc3a02c3 use qnn graph inside add and mul ops 2024-07-05 13:27:16 +08:00
hongruichen
a688ed324b add op param to add_nodes 2024-07-05 13:07:48 +08:00
hongruichen
4b2ee61f62 move graph map to backend object 2024-07-05 11:58:47 +08:00
hongruichen
ca0d999c2a add ggml_qnn_graph 2024-07-05 11:35:18 +08:00
hongruichen
000240cf62 add clang format file and reformatting 2024-07-04 23:29:31 +08:00
hongruichen
38f88d5fb1 fix compiling error after merge latest master 2024-07-03 00:13:53 +08:00
hongruichen
8b677d1b2f move qnn backend into sub folder 2024-07-02 19:42:14 +08:00
hongruichen
3808a4c1e0 Merge branch 'master' into dev-refactoring 2024-07-01 22:52:08 +08:00
Roni
0ddeff1023
readme : update tool list (#8209)
* Added gppm to Tool list in README

* Update README.md

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-07-01 15:48:16 +03:00
Michael Francis
3840b6f593
nix : enable curl (#8043)
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-07-01 14:47:04 +03:00
Georgi Gerganov
257f8e41e2
nix : remove OpenCL remnants (#8235)
* nix : remove OpenCL remnants

* minor : remove parentheses
2024-07-01 14:46:18 +03:00
iacore
694c59cb42
Document BERT support. (#8205)
* Update README.md

document BERT support

* Update README.md
2024-07-01 13:40:58 +02:00
zhentaoyu
197fe6c1d7
[SYCL] Update SYCL-Rope op and Refactor (#8157)
* align with rope.cu and move sycl-op to a single file
2024-07-01 19:39:06 +08:00
Georgi Gerganov
d0a7145ba9
flake.lock: Update (#8218) 2024-06-30 16:09:34 -07:00
Xuan Son Nguyen
9ef0780062
Fix new line issue with chat template, disable template when in-prefix/suffix is set (#8203)
* preserve new line llama_chat_format_single

* disable chat template if in-prefix/suffix is set

* remove redundant change
2024-06-30 20:27:13 +02:00
Andrei
1c5eba6f8e
llama: Add attention and final logit soft-capping, update scaling factor to Gemma2 (#8197)
* Add attention and final logit softcapping.

* fix

* Add custom add_ functions

* Disable flash attention for Gemma2

* Update src/llama.cpp

Co-authored-by: slaren <slarengh@gmail.com>

* Add default value for attention and final logit softcap value

* Add custom kq scaling from Gemma2Attention

* Remove custom pre attention scaling and use computed value instead.

---------

Co-authored-by: slaren <slarengh@gmail.com>
2024-06-29 23:44:08 -04:00
Xuan Son Nguyen
72272b83a3
fix code typo in llama-cli (#8198) 2024-06-29 00:14:20 +02:00
Olivier Chafik
8748d8ac6f
json: attempt to skip slow tests when running under emulator (#8189) 2024-06-28 18:02:05 +01:00
Xuan Son Nguyen
26a39bbd6b
Add MiniCPM, Deepseek V2 chat template + clean up llama_chat_apply_template_internal (#8172)
* tmp_contains

* minicpm chat template

* add DeepSeek Lite template

* change deepseek-lite to deepseek2

* correct code comment

* correct code from master branch
2024-06-28 15:11:44 +02:00